1. Introduction

1.1. Determinism versus randomness. A feature of many real-life phenomena in areas as diverse 
as physics, biology, finance, economics, and many others, is the random-like behaviour of processes 
which nevertheless are clearly deterministic. On the level of applications this dual aspect has proved 
very problematic. Specific mathematical models tend to be developed either on the basis that the 
process is deterministic, in which case sophisticated numerical techniques can be used to attempt to 
understand and predict the evolution, or that it is random, in which case probability theory is used to 
model the process. Both approaches lose sight of what is probably the most important and significant 
characteristic of the system which is precisely that it is deterministic and has random-like behaviour. 
The theory of Dynamical Systems has contributed a phenomenal amount of work showing that it 
is perfectly natural for completely deterministic systems to behave in a very random-like way and 
achieving a quite remarkable understanding of the mechanisms by which this occurs. The purpose of 
these notes is to survey some of this research. 

1.2. Nonuniform expansivity. We shall assume that the state space can be represented by a compact
Riemannian manifold $M$ and that the evolution of the process is given by a map $f : M \to M$ which
is piecewise differentiable. Following an approach which goes back at least to the first half of the 20th
century, we shall discuss how certain statistical properties can be deduced from geometrical assumptions
on $f$, formulated explicitly in terms of assumptions on the derivative map $Df$ of $f$. The basic
strategy is to construct certain geometrical structures which then imply some statistical/probabilistic
properties of the dynamics (a striking and pioneering example of this is the work of Hopf on the
ergodicity of geodesic flows on manifolds of negative curvature [Hop39]). Research following
this basic line of reasoning goes under the heading of Hyperbolic Dynamics and/or Smooth Ergodic
Theory.

The main focus of these notes will be on maps which satisfy an asymptotic expansivity condition.

Definition 1. We say that $f : M \to M$ is (nonuniformly) expanding if there exists $\lambda > 0$ such that

\[ \liminf_{n \to \infty} \frac{1}{n} \sum_{i=0}^{n-1} \log \|Df_{f^i(x)}^{-1}\|^{-1} > \lambda \tag{$*$} \]

for almost every $x \in M$. Equivalently, for almost every $x \in M$ there exists a constant $C_x > 0$ such
that

\[ \prod_{i=0}^{n-1} \|Df_{f^i(x)}^{-1}\|^{-1} \ge C_x e^{\lambda n} \]

for every $n \ge 1$.
The definition and the corresponding results can be generalized to the case in which the expansivity
condition holds only on an invariant set of positive measure instead of on the entire manifold $M$. See
also [Alv03] for a detailed treatment of the theory of nonuniformly expanding maps. Notice that the
condition $\log \|Df_x^{-1}\|^{-1} > 0$, which is equivalent to $\|Df_x^{-1}\|^{-1} > 1$ and which in turn is equivalent
to $\|Df_x^{-1}\| < 1$, implies that all vectors in all directions are contracted by the inverse of $Df_x$ and
thus that all vectors in all directions are expanded by $Df_x$; the intuitively more obvious condition
$\log \|Df_x\| > 0$, which is equivalent to $\|Df_x\| > 1$, implies only that there is at least one direction
in which vectors are expanded by $Df_x$. Thus a map is nonuniformly expanding if every vector is
asymptotically expanded at a uniform exponential rate. The constant $C_x$ can in principle be arbitrarily
small and indicates that an arbitrarily large number of iterates may be needed before this exponential
growth becomes apparent.
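
As a rough numerical illustration of condition (*), the following Python sketch computes the finite-time average $\frac{1}{n}\sum_{i<n} \log|f'(f^i(x))|$ for one-dimensional maps, for which $\|Df_x^{-1}\|^{-1}$ reduces to $|f'(x)|$. The name `expansion_rate` and both example maps are assumptions chosen for illustration and are not taken from the text; a finite average can only suggest, not prove, that the liminf is positive.

```python
import math

def expansion_rate(f, df, x0, n):
    """Finite-time average (1/n) * sum_{i<n} log|f'(f^i(x0))| for a 1D map.

    In dimension one ||Df_x^{-1}||^{-1} equals |f'(x)|, so this is the
    quantity whose liminf appears in condition (*).
    """
    x, total = x0, 0.0
    for _ in range(n):
        total += math.log(abs(df(x)))
        x = f(x)
    return total / n

def doubling(x):
    # Uniformly expanding: |f'| = 2 at every point.
    return (2.0 * x) % 1.0

def quadratic(x):
    # Hypothetical test case with a critical point at x = 1/2,
    # so the expansion cannot be uniform.
    return 4.0 * x * (1.0 - x)

rate_doubling = expansion_rate(doubling, lambda x: 2.0, 0.3, 1_000)
rate_quadratic = expansion_rate(quadratic, lambda x: 4.0 - 8.0 * x, 0.3, 20_000)
```

For the doubling map every term of the sum equals $\log 2$, so the average is $\log 2$ for every $n$ and the constant $C_x$ can be taken equal to 1. For the quadratic map individual terms can be very negative when the orbit passes close to the critical point, which is the kind of behaviour that an arbitrarily small constant $C_x$ accounts for.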

In the special case in which condition (*) holds at every point $x$ and the constant $C_x$ can be chosen
uniformly positive independent of $x$ we say that $f$ is uniformly expanding. Thus uniformly expanding
is a special case of nonuniformly expanding. The terminology is slightly awkward for historical
reasons: uniformly expanding maps have traditionally been referred to simply as expanding maps
whereas this term should more appropriately refer to the more general (i.e. possibly nonuniformly)
expanding case. We shall generally say that $f$ is strictly nonuniformly expanding if $f$ satisfies
condition (*) but is not uniformly expanding. A basic theme of these notes is to discuss the
difference between uniformly and nonuniformly expanding maps: how the nonuniformity affects the
results and the ideas and techniques used in the proofs, and how different degrees of nonuniformity can
be quantified.

Nonuniform expansivity is a special case of nonuniform hyperbolicity. This concept was first
formulated and studied by Pesin [Pes76, Pes77] and has since become one of the main areas of research
in dynamical systems; see [Bunetal89, You95b, BarPes01], and [BarPes04] by Barreira and Pesin in
this volume, for extensive and in-depth surveys. The formal definition is in terms of non-zero Lyapunov
exponents, which means that the tangent bundle can be decomposed into subbundles in which
vectors either contract or expand at an asymptotically exponential rate. Nonuniform expansivity
corresponds to the case in which all the Lyapunov exponents are positive and therefore all vectors expand
asymptotically at an exponential rate. The natural setting for this situation is that of (non-invertible)
local diffeomorphisms whereas the theory of nonuniform hyperbolicity has been developed mainly for
diffeomorphisms (however see also [Rue82] and [BarPes04, Section 5.8]). For greater generality, and
also because this has great importance for applications, we shall also allow various kinds of critical
and/or singular points for $f$ or its derivative.

1.3. General overview of the notes. We first review the basic notions of invariant measure, ergodicity,
mixing, and decay of correlations in order to fix the notation and to motivate the results and
techniques. In section 2 we discuss the key idea of a Markov Structure and sketch some of the arguments
used to study systems which admit such a structure. In sections 3, 4 and 5 we give a historical
and technical survey of many classes of systems for which results are known, giving references to
the original proofs whenever possible, and sketching in varying amounts of detail the construction
of Markov structures in such systems. In section 6 we present some recent abstract results which
go towards a general theory of nonuniformly expanding maps. In section 7 we discuss the important
problem of verifying the geometric nonuniform expansivity assumptions in specific classes of
maps. Finally, in section 8 we make some concluding remarks and present some open questions and
conjectures.

The focus on Markov structures is partly a matter of personal preference; in some cases the results
can be proved and/or were first proved using completely different arguments and techniques.
Of particular importance is the so-called Functional-Analytic approach in which the problems are
reformulated and reduced to questions about the spectrum of a certain linear operator on some functional
space. There are several excellent survey texts focussing on this approach, see [Bal01, Liv04, Via].
In any case, it is hard to see how the study of systems in which the hyperbolicity or expansivity is
nonuniform can be carried out without constructing or defining some kind of subdivision into subsets
on which relevant estimates satisfy uniform bounds. The Markov structures to be described below
provide one very useful way in which this can be done and give some concrete geometrical structure.
It seems very likely that these structures will prove useful in studying many other features of
nonuniformly hyperbolic or expanding systems such as their stability, persistence, and even existence
in particular settings. Another quite different way to partition a set satisfying nonuniform hyperbolicity
conditions is with so-called Pesin or regular sets, see [BarPes04, Section 4.5]. These sets play a
very useful role in the general theory of nonuniform hyperbolicity for diffeomorphisms, for example
in the construction of the stable and unstable foliations.

We shall always assume that $M$ is a smooth, compact, Riemannian manifold of dimension $d \ge 1$.
For simplicity we shall call the Riemannian volume Lebesgue measure, denote it by $m$ or $| \cdot |$ and
assume that it is normalized so that $m(M) = |M| = 1$. We let $f : M \to M$ denote a Lebesgue-measurable
map. In practice we shall always assume significantly more regularity on $f$, e.g. that $f$ is
differentiable or at least piecewise differentiable, but the main definitions apply in the more general
case of $f$ measurable. All measures on $M$ will be assumed to be defined on the Borel $\sigma$-algebra of $M$.

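1.4. Invariant measures. For a set $A \subseteq M$ and a map $f : M \to M$ we define
$f^{-1}(A) = \{x : f(x) \in A\}$.

Definition 2. We say that a probability measure $\mu$ on $M$ is invariant under $f$ if

\[ \mu(f^{-1}(A)) = \mu(A) \]

for every $\mu$-measurable set $A \subseteq M$.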

A given measure can be invariant for many different maps. For example Lebesgue measure on the
circle $M = S^1$ is invariant for the identity map $f(\theta) = \theta$, the rotation $f(\theta) = \theta + \alpha$ for any $\alpha \in \mathbb{R}$,
and the covering map $f(\theta) = k\theta$ for any $k \in \mathbb{N}$. Similarly, for a given point $p \in M$, the Dirac-$\delta$
measure $\delta_p$ defined by

\[ \delta_p(A) = \begin{cases} 1 & \text{if } p \in A, \\ 0 & \text{if } p \notin A, \end{cases} \]

is invariant for any map $f$ for which $f(p) = p$. On the other hand, a given map $f$ can admit many
invariant measures. For example any probability measure is invariant for the identity map $f(x) = x$
and, more generally, any map which admits multiple fixed or periodic points admits as invariant
measures the Dirac-$\delta$ measures supported on such fixed points or their natural generalizations distributed
along the orbit of the periodic points. There exist also maps that do not admit any invariant probability
measures. However some mild conditions, e.g. continuity of $f$, do guarantee that there exists at least
one.
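
The invariance of Lebesgue measure under a covering map can be checked by hand on intervals, and the following Python sketch does this in exact rational arithmetic. The helper names `preimage` and `lebesgue`, and the particular values of `a`, `b` and `k`, are hypothetical choices made for illustration.

```python
from fractions import Fraction

def preimage(k, a, b):
    """Preimage of [a, b) under f(theta) = k*theta (mod 1).

    It consists of k intervals, each of length (b - a)/k.
    """
    return [((a + j) / k, (b + j) / k) for j in range(k)]

def lebesgue(intervals):
    """Total length of a list of pairwise disjoint intervals."""
    return sum(hi - lo for lo, hi in intervals)

a, b, k = Fraction(1, 5), Fraction(2, 3), 3
pre = preimage(k, a, b)
measure_of_preimage = lebesgue(pre)   # equals b - a
```

The same computation shows why invariance is phrased in terms of preimages: for $k = 3$ the forward image of $[0, 1/3)$ is the whole circle, so the measure of a forward image is in general not equal to the measure of the set.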

A first step in the application of the theory and methods of ergodic theory is to introduce some
ways of distinguishing between the various invariant measures. We do this by introducing various
properties which such measures may or may not satisfy. Unless we specify otherwise we shall use $\mu$
to denote a generic invariant probability measure for a given unspecified map $f : M \to M$.






1.5. Ergodicity. 



NONUNIFORMLY EXPANDING MAPS 






Definition 3. We say that $\mu$ is ergodic if there does not exist a measurable set $A$ with

\[ f^{-1}(A) = A \quad \text{and} \quad \mu(A) \in (0, 1). \]

In other words, any fully invariant set $A$, i.e. a set for which $f^{-1}(A) = A$, has either zero or
full measure. This is a kind of indecomposability property of the measure. If a fully invariant set $A$
with $\mu(A) \in (0, 1)$ existed, its complement $B = A^c$ would also be fully invariant and, in particular,
both $A$ and $B$ would also be forward invariant: $f(A) = A$ and $f(B) = B$. Thus no orbit originating
in $A$ could ever enter $B$ and vice versa, and we essentially have two independent dynamical systems.

Simple examples such as the Dirac-$\delta_p$ measure on a fixed point $p$ are easily shown to be ergodic,
but in general this is a highly non-trivial property to prove. Many of the techniques and methods to be
described below are fundamentally motivated by the basic question of whether some relevant invariant
measures are ergodic. It is known that Lebesgue measure is ergodic for circle rotations $f(\theta) = \theta + \alpha$
when $\alpha$ is irrational and for covering maps $f(\theta) = k\theta$ when $k \in \mathbb{N}$ satisfies $k \ge 2$ (the proof of
ergodicity for the latter case will be sketched below). Irrational circle rotations are very special because
they do not admit any other invariant measures besides Lebesgue measure. On the other hand covering
maps have infinitely many periodic points and thus admit infinitely many invariant measures. It is
sometimes easier to show that certain examples are not ergodic. This is clearly true for example for
Lebesgue measure and the identity map since any subset is fully invariant. A less trivial example is the map
$f : [0,1] \to [0,1]$ given by



\[ f(x) = \begin{cases} 2x & \text{if } 0 \le x < 1/4, \\ -2x + 1 & \text{if } 1/4 \le x < 1/2, \\ 2x - 1/2 & \text{if } 1/2 \le x < 3/4, \\ -2x + 5/2 & \text{if } 3/4 \le x \le 1. \end{cases} \]

Lebesgue measure is invariant, but the intervals $[0, 1/2)$ and $[1/2, 1]$ are both backward (and forward)
invariant. This example can easily be generalized by defining two different Lebesgue measure
preserving transformations mapping each of the two subintervals $[0, 1/2)$ and $[1/2, 1]$ into themselves.
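
A minimal Python sketch of this non-ergodic example follows. It uses exact rational arithmetic, because floating-point orbits of maps with slopes $\pm 2$ degenerate after a few dozen iterations; the function names and the chosen initial points are assumptions made for illustration. The half-open conventions at the break points are also an assumption; with the convention used here the single point $x = 1/4$ is sent to $1/2$, which does not matter from the point of view of Lebesgue measure.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def f(x):
    """The four-branch piecewise linear map on [0, 1] described above."""
    if x < Fraction(1, 4):
        return 2 * x
    if x < HALF:
        return -2 * x + 1
    if x < Fraction(3, 4):
        return 2 * x - HALF
    return -2 * x + Fraction(5, 2)

def time_in_right_half(x, n):
    """Proportion of times 1 <= j <= n for which f^j(x) lies in [1/2, 1]."""
    count = 0
    for _ in range(n):
        x = f(x)
        if x >= HALF:
            count += 1
    return Fraction(count, n)

left_average = time_in_right_half(Fraction(1, 7), 300)    # orbit stays in [0, 1/2)
right_average = time_in_right_half(Fraction(4, 7), 300)   # orbit stays in [1/2, 1]
```

Orbits starting in the left half never visit the right half and vice versa, so the time averages of the characteristic function of $[1/2, 1]$ take the values 0 and 1 rather than the Lebesgue measure $1/2$ of that interval.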

The fundamental role played by the notion of ergodicity is given by the well known and classical
Birkhoff Ergodic Theorem. We give here only a special case of this result.

Theorem ([Bir31, Bir42]). Let $f : M \to M$ be a measurable map and let $\mu$ be an ergodic invariant
probability measure for $f$. Then, for any function $\varphi : M \to \mathbb{R}$ in $L^1(\mu)$, i.e. such that
$\int |\varphi| \, d\mu < \infty$, and for $\mu$-almost every $x$ we have

\[ \frac{1}{n} \sum_{i=1}^{n} \varphi(f^i(x)) \to \int \varphi \, d\mu \quad \text{as } n \to \infty. \]

In particular, for any measurable set $A \subseteq M$, letting $\varphi = \mathbf{1}_A$ be the characteristic function of $A$, we
have for $\mu$-almost every $x \in M$,

\[ \frac{\#\{1 \le j \le n : f^j(x) \in A\}}{n} \to \mu(A). \tag{1} \]

Here $\#\{1 \le j \le n : f^j(x) \in A\}$ denotes the cardinality of the set of indices $j$ for which
$f^j(x) \in A$. Thus the average proportion of time which the orbit of a typical point spends in $A$
converges precisely to the $\mu$-measure of $A$. Notice that the convergence of this proportion as $n \to \infty$
is in itself an extremely remarkable and non-intuitive result. The fact that the limit is given a priori
by $\mu(A)$ means in particular that this limit is independent of the specific initial condition $x$. Thus
$\mu$-almost every initial condition has the same statistical distribution in space and this distribution depends
only on $\mu$ and not even on the map $f$, except implicitly for the fact that $\mu$ is ergodic and invariant for $f$.
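
The convergence (1) can be observed numerically. The sketch below, in Python, computes the visit frequency for an irrational circle rotation, for which Lebesgue measure is ergodic; the rotation number, the set $A = [0.2, 0.5)$ and the initial points are arbitrary choices made for illustration, and floating-point arithmetic is assumed to be accurate enough over the orbit lengths used.

```python
import math

def visit_frequency(step, in_A, x0, n):
    """Proportion of times 1 <= j <= n with f^j(x0) in A, as in (1)."""
    x, hits = x0, 0
    for _ in range(n):
        x = step(x)
        if in_A(x):
            hits += 1
    return hits / n

alpha = (math.sqrt(5) - 1) / 2            # an irrational rotation number

def rotation(x):
    return (x + alpha) % 1.0

def in_A(x):
    return 0.2 <= x < 0.5                 # A = [0.2, 0.5), so m(A) = 0.3

freqs = [visit_frequency(rotation, in_A, x0, 20_000) for x0 in (0.0, 0.123, 0.77)]
```

All three initial conditions give frequencies close to $m(A) = 0.3$, in line with the statement that the limit does not depend on the initial condition.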

1.6. Absolute continuity. Some care needs to be taken when applying Birkhoff's ergodic theorem
to maps which admit several ergodic invariant measures. Consider for example the circle map
$f(\theta) = 10\theta$. This map preserves Lebesgue measure and also has several fixed points, e.g. $p = 0.2222\ldots$, on
which we can consider the Dirac-$\delta_p$ measure. Both these measures are ergodic. Thus an application of
Birkhoff's theorem says that "almost every" point spends an average proportion of time converging to
$m(A)$ in the set $A$ but also that "almost every" point spends an average proportion of time converging
to $\delta_p(A)$ in the set $A$. If $m(A) \ne \delta_p(A)$ this may appear to generate a contradiction.

The crucial observation here is that the notion of almost every point is always understood with
respect to a particular measure. Thus Birkhoff's ergodic theorem asserts that for a given measure
$\mu$ there exists a set $\tilde M \subseteq M$ with $\mu(\tilde M) = 1$ such that the convergence property holds for every
$x \in \tilde M$, and in general it may not be possible to identify $\tilde M$ explicitly. Conversely, if $X \subseteq M$ satisfies
$\mu(X) = 0$ then no conclusion can be drawn about whether (1) holds for any point of $X$. Returning
to the example given above we have the following situation: the convergence (1) of the time averages
to $\delta_p(A)$ can be guaranteed only for points belonging to a minimal set of full measure. But in this
case this set reduces to the single point $p$, for which (1) clearly holds. On the other hand the single
point $p$ clearly has zero Lebesgue measure and thus the convergence (1) to $m(A)$ at $p$ is not guaranteed by
Birkhoff's Theorem. Thus there is no contradiction.
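
The two kinds of "almost every" point can be contrasted in a short Python sketch for the map $f(\theta) = 10\theta$ and the set $A = [0, 1/10)$, for which $m(A) = 1/10$ and $\delta_p(A) = 0$. The fixed point is handled in exact rational arithmetic. Since $f$ shifts decimal digits, a Lebesgue-typical point is imitated here by a string of pseudo-random digits; this stand-in is an assumption of the sketch, not a construction from the text.

```python
import random
from fractions import Fraction

def f(theta):
    """The circle map theta -> 10*theta (mod 1), in exact arithmetic."""
    return (10 * theta) % 1

def frequency_in_A(x0, n):
    """Proportion of times 1 <= j <= n with f^j(x0) in A = [0, 1/10)."""
    x, hits = x0, 0
    for _ in range(n):
        x = f(x)
        if x < Fraction(1, 10):
            hits += 1
    return Fraction(hits, n)

p = Fraction(2, 9)                       # p = 0.2222..., a fixed point of f
freq_at_p = frequency_in_A(p, 1000)      # time average along the orbit of p

# f^j(theta) lies in A exactly when decimal digit j+1 of theta is 0.
# Pseudo-random digits play the role (by assumption) of a typical point.
rng = random.Random(0)
digits = [rng.randrange(10) for _ in range(20_000)]
freq_typical = digits.count(0) / len(digits)
```

The time average at $p$ is $\delta_p(A) = 0$, while the digit frequency for the stand-in typical point is close to $m(A) = 1/10$.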

An important point therefore is that the information provided by Birkhoff's ergodic theorem
depends on the measure $\mu$ under consideration. Based on the premise that Lebesgue is the given
"physical" measure and that we consider a satisfactory description of the dynamics one which accounts
for a sufficiently large set of points from the point of view of Lebesgue measure, it is clear that if $\mu$
is a Dirac-$\delta$ measure on a fixed point it gives essentially no useful information. On the other hand,
if $\mu$ is Lebesgue measure itself then we do get a convergence result that holds for Lebesgue almost
every starting condition. The invariance of Lebesgue measure is a very special property but much
more generally we can ask about the existence of ergodic invariant measures $\mu$ which are absolutely
continuous with respect to $m$.

Definition 4. $\mu$ is absolutely continuous with respect to $m$ if

\[ m(A) = 0 \quad \text{implies} \quad \mu(A) = 0 \]

for every measurable set $A \subseteq M$.

In this case, Birkhoff's theorem implies that (1) holds for all points belonging to a set $\tilde M \subseteq M$ with
$\mu(\tilde M) = 1$ and the absolute continuity of $\mu$ with respect to $m$ therefore implies that $m(\tilde M) > 0$. Thus
the existence of an ergodic absolutely continuous invariant probability (acip) $\mu$ implies some control
over the asymptotic distribution of at least a set of positive Lebesgue measure. It also implies that
such points tend to have a dynamics which is non-trivial in the sense that it is distributed over some
relatively large subset of the space as opposed to converging for example to some attracting fixed
point. Thus it indicates that there is a minimum amount of inherent complexity as well as structure.
This motivates the basic question:

Question 1. Under what conditions does $f$ admit an ergodic acip?

This question is already addressed explicitly by Hopf [Hop32] for invertible transformations.
Interestingly he formulates some conditions in terms of the existence of what are essentially some induced
transformations, similar in some respects to the Markov structures to be defined below.¹ In these
notes we shall discuss what is effectively a generalization of this basic approach.

1.7. Mixing. Birkhoff's ergodic theorem is very powerful but it is easy to see that the asymptotic
space distribution given by (1) does not necessarily tell the whole story about the dynamics of a given
map $f$. Indeed these conclusions depend not on $f$ but simply on the fact that Lebesgue measure
is invariant and ergodic. Thus from this point of view the dynamics of an irrational circle rotation
$f(\theta) = \theta + \alpha$ and of the map $f(\theta) = 2\theta$ are indistinguishable. However it is clear that they give
rise to very different kinds of dynamics. In one case for example, nearby points remain nearby for
all time, whereas in the other they tend to move apart at an exponential speed. This creates a kind of
unpredictability in one case which is not present in the other.

Definition 5. We say that an invariant probability measure $\mu$ is mixing if

\[ |\mu(A \cap f^{-n}(B)) - \mu(A)\mu(B)| \to 0 \]

as $n \to \infty$, for all measurable sets $A, B \subseteq M$.

Notice that mixing implies ergodicity and is therefore a stronger property. Thus a natural follow up
to question 1 is the following. Suppose that $f$ admits an ergodic acip $\mu$.



Question 2. Under what conditions is $\mu$ mixing?

Early work in ergodic theory in the 1940's considered the question of the genericity of the mixing
property in various spaces of systems [Hal44, Roh48, Ros56, Hol57, KacKes58] but, as with ergodicity,
in specific classes of systems it is generally easier to show that a system is not mixing rather
than that it is mixing. For example it is immediate that irrational circle rotations are not mixing. On
the other hand it is non-trivial that maps of the form $f(\theta) = k\theta$ for integers $k \ge 2$ are mixing.

To develop an intuition for the concept of mixing, notice that mixing is equivalent to the condition

\[ \frac{\mu(A \cap f^{-n}(B))}{\mu(B)} \to \mu(A) \]

as $n \to \infty$, for all measurable sets $A, B \subseteq M$ with $\mu(B) \ne 0$. In this form there are two natural
interpretations of mixing, one geometrical and one probabilistic. From a geometrical point of view,
recall that $\mu(f^{-n}(B)) = \mu(B)$ by the invariance of the measure. Then one can think of $f^{-n}(B)$ as
a "redistribution of mass" and the mixing condition says that for large $n$ the proportion of $f^{-n}(B)$
which intersects $A$ is just proportional to the measure of $A$. In other words $f^{-n}(B)$ is spreading
itself uniformly with respect to the measure $\mu$. A more probabilistic point of view is to think of
$\mu(A \cap f^{-n}(B))/\mu(B)$ as the conditional probability of having $x \in A$ given that $f^n(x) \in B$, i.e. the
probability that the occurrence of the event $B$ today is a consequence of the occurrence of the event $A$
$n$ steps in the past. The mixing condition then says that this probability converges to the probability
of $A$, i.e., asymptotically, there is no causal relation between the two events. This is why we say that
a mixing system exhibits stochastic-like or random-like behaviour.
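
The difference between a mixing and a non-mixing map can be made concrete with a small Python sketch that estimates $m(A \cap f^{-n}(B))$ on a grid for $A = B = [0, 1/2)$, comparing the doubling map $f(\theta) = 2\theta$ with a circle rotation. The grid size, the rotation number $\sqrt{2} - 1$ and the helper names are assumptions made for illustration.

```python
def overlap(fn, in_A, in_B, grid=4096):
    """Midpoint-rule estimate of m(A intersect f^{-n}(B)); fn computes f^n."""
    hits = 0
    for i in range(grid):
        x = (i + 0.5) / grid
        if in_A(x) and in_B(fn(x)):
            hits += 1
    return hits / grid

def half(x):
    return x < 0.5                      # A = B = [0, 1/2), so m(A) m(B) = 1/4

alpha = 2 ** 0.5 - 1                    # an irrational rotation number

def doubling_n(n):
    return lambda x: (2 ** n * x) % 1.0

def rotation_n(n):
    return lambda x: (x + n * alpha) % 1.0

mix_doubling = [overlap(doubling_n(n), half, half) for n in range(1, 8)]
mix_rotation = [overlap(rotation_n(n), half, half) for n in range(1, 8)]
```

For the doubling map and these dyadic sets the overlap equals $m(A)m(B) = 1/4$ already for every $n \ge 1$, which is a special feature of dyadic intervals rather than typical behaviour. For the rotation the overlap is the length of $A \cap (B - n\alpha)$, which keeps oscillating with $n$ and does not converge to $1/4$.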



¹ Hopf's result is the following: suppose that for every measurable partition $\mathcal{P}$ of the manifold $M$ and every stopping
time function $p$ such that the images $f^{p(\omega)}(\omega)$ for $\omega \in \mathcal{P}$ are all disjoint, the union of all images has full measure. Then $f$
admits an absolutely continuous invariant probability measure.
1.8. Decay of correlations. It turns out that mixing is indeed a quite generic property at least under
certain assumptions which will generally hold in the examples we shall be interested in. Thus
apparently very different systems admit mixing acip's and become, in some sense, statistically
indistinguishable at this level of description. Thus it is natural to want to dig deeper in an attempt to relate
finer statistical properties with specific geometric characteristics of the systems under consideration. One
way to do this is to try to distinguish systems which mix at different speeds. To formalize this idea
we need to generalize the definition of mixing. Notice first of all that the original definition can be
written in integral form as





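\[ \left| \int \mathbf{1}_{A \cap f^{-n}(B)} \, d\mu - \int \mathbf{1}_A \, d\mu \int \mathbf{1}_B \, d\mu \right| \to 0, \]

where $\mathbf{1}_X$ denotes the characteristic function of the set $X$. This can be written in the equivalent form

\[ \left| \int \mathbf{1}_A \, (\mathbf{1}_B \circ f^n) \, d\mu - \int \mathbf{1}_A \, d\mu \int \mathbf{1}_B \, d\mu \right| \to 0, \]

and this last formulation now admits a natural generalization by replacing the characteristic functions
with arbitrary measurable functions.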

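Definition 6. For real valued measurable functions $\varphi, \psi : M \to \mathbb{R}$ we define the correlation function²

\[ C_n(\varphi, \psi) = \left| \int \psi \, (\varphi \circ f^n) \, d\mu - \int \psi \, d\mu \int \varphi \, d\mu \right|. \]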



In this context, the functions $\varphi$ and $\psi$ are often called observables. If $\mu$ is mixing, the correlation
function decays to zero whenever the observables $\varphi, \psi$ are characteristic functions. It is possible to
show that indeed it decays also for many other classes of functions. We then have the following
very natural question. Suppose that the measure $\mu$ is mixing, fix two observables $\varphi, \psi$, and let
$C_n = C_n(\varphi, \psi)$.

Question 3. Does $C_n$ decay at a specific rate depending only on $f$?



The idea behind this question is that a system may have an intrinsic rate of mixing which reflects
some characteristic geometrical structures. It turns out that an intrinsic rate does sometimes exist
and is in some cases possible to determine, but only by restricting to a suitable class of observables.
Indeed, a classical result says that even in the "best" cases it is possible to choose subsets $A, B$ such
that the correlation function $C_n(\mathbf{1}_A, \mathbf{1}_B)$ of the corresponding characteristic functions decays at an
arbitrarily slow rate. Positive results do exist in many cases, however, when one restricts to, for example, the
space of observables of bounded variation, or Hölder continuous, or even continuous with non-Hölder
modulus of continuity. Once the space $\mathcal{H}$ of observables has been fixed, the goal is to show that there
exists a sequence $\gamma_n \to 0$ (e.g. $\gamma_n = e^{-\alpha n}$ or $\gamma_n = n^{-\alpha}$ for some $\alpha > 0$) depending only on $f$ and
$\mathcal{H}$, such that for any two $\varphi, \psi \in \mathcal{H}$ there exists a constant $C = C(\varphi, \psi)$ (generally depending on the
observables $\varphi, \psi$) such that

\[ C_n(\varphi, \psi) \le C \gamma_n \]

for all $n \ge 1$. Ideally we would like to show that $C_n$ actually decays like $\gamma_n$, i.e. to have both lower and
upper bounds, but this is known only in some very particular cases. Most known results at present
are upper bounds and thus when we say that the correlation function decays at a certain rate we will
usually mean that it decays at least at that rate. Also, most known results deal with Hölder continuous
observables and thus, to simplify the presentation, we shall assume that we are dealing with
this class unless we mention otherwise.

² The derivation of the correlation function from the definition of mixing as given here does not perhaps correspond to
the historical development. I believe that the notion of decay of correlations arose in the context of statistical mechanics and
was not directly linked to the abstract dynamical systems framework until the work of Bowen, Lebowitz, Ruelle and Sinai in
the 1960's and 1970's [Sin68, Bow70, Sin72, PenLeb74, Bow75].

We shall discuss below several examples of systems whose correlations decay at different rates,
for example exponential, polynomial or even logarithmic, and a basic theme of these notes will be
to gain some understanding about how and why such differences occur and what this tells us about the
system.
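
As a simple worked instance of exponential decay, consider the doubling map $f(\theta) = 2\theta$ with Lebesgue measure and the observables $\varphi(x) = \psi(x) = x$, which are of bounded variation. A direct computation gives $\int_0^1 x \, (2^n x \bmod 1) \, dx = 1/4 + 1/(12 \cdot 2^n)$ and hence $C_n = 1/(12 \cdot 2^n)$. The Python sketch below estimates the same quantity by a midpoint rule; the function name `correlation`, the grid size and the choice of observables are assumptions made for illustration.

```python
def correlation(psi, phi, fn, grid=2 ** 14):
    """Midpoint-rule estimate of C_n(phi, psi) for Lebesgue measure on [0, 1).

    fn computes the n-th iterate f^n; by invariance the integral of
    phi o f^n equals the integral of phi.
    """
    xs = [(i + 0.5) / grid for i in range(grid)]
    joint = sum(psi(x) * phi(fn(x)) for x in xs) / grid
    mean_psi = sum(psi(x) for x in xs) / grid
    mean_phi = sum(phi(x) for x in xs) / grid
    return abs(joint - mean_psi * mean_phi)

def identity(x):
    return x

def doubling_n(n):
    return lambda x: (2 ** n * x) % 1.0

C = [correlation(identity, identity, doubling_n(n)) for n in range(1, 9)]
ratios = [C[i + 1] / C[i] for i in range(len(C) - 1)]
```

The successive ratios are close to $1/2$, so in the notation above one can take $\gamma_n = 2^{-n}$ for this pair of observables. This is only one pair of observables for one map and says nothing about the rate for a whole class $\mathcal{H}$.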

2. Markov Structures 

Definition 7. $f : M \to M$ is (or admits) a Markov map if there exists a finite or countable partition $\mathcal{P}$
(mod 0) of $M$ into open sets with smooth boundaries such that $f(\omega) = M$ for every partition element
$\omega \in \mathcal{P}$ and $f|_\omega$ is a continuous non-singular bijection.

We recall that a partition mod 0 of $M$ means that Lebesgue almost every point belongs to the
interior of some partition element. Also, $f|_\omega$ is non-singular if $|A| > 0$ implies $|f(A)| > 0$ for every
(measurable) $A \subseteq \omega$. These two conditions together immediately imply that the full forward orbit of
almost every point always lies in the interior of some partition element. The condition $f(\omega) = M$ is
a particularly strong version of what is generally referred to as the Markov property, where it is only
required that each $\omega$ be mapped by a continuous non-singular bijection onto some union of partition
elements and not necessarily onto all of $M$. The stronger requirement we use here is sometimes called a
Bernoulli property. A significant generalization of this definition allows the partition elements to be
just measurable sets and not necessarily open; the general results to be given below apply in this case
also. However we shall not need this for any of the applications which we shall discuss.

A natural but extremely far-reaching generalization of the notion of a Markov map is the following. 



Definition 8. $f : M \to M$ admits an induced Markov map if there exists an open set $\Delta \subseteq M$, a
partition $\mathcal{P}$ (mod 0) of $\Delta$ and a return time function $R : \Delta \to \mathbb{N}$, piecewise constant on each element
of $\mathcal{P}$, such that the induced map $F : \Delta \to \Delta$ defined by $F(x) = f^{R(x)}(x)$ is a Markov map.

Again, the condition that $\Delta$ is open is not strictly necessary. For the rest of this section we shall
suppose that $F : \Delta \to \Delta$ is an induced Markov map associated to some map $f : M \to M$. We
call $\mathcal{P}$ the Markov partition associated to the Markov map $F$. Clearly if $f$ is a Markov map to begin
with, it trivially admits an induced Markov map with $\Delta = M$ and $R \equiv 1$. Since $\mathcal{P}$ is assumed to be
countable, we can define an indexing set $\mathcal{I} = \{0, 1, 2, \ldots\}$ of the Markov partition $\mathcal{P}$. Then, for any
finite sequence $a_0 a_1 a_2 \ldots a_n$ with $a_i \in \mathcal{I}$, we can define the cylinder set of order $n$ by

\[ \omega^{(n)}_{a_0 a_1 \ldots a_n} = \{x : F^i(x) \in \omega_{a_i} \text{ for } 0 \le i \le n\}. \]

Inductively, given $\omega_{a_0 a_1 \ldots a_{n-1}}$, then $\omega_{a_0 a_1 \ldots a_n}$ is the part of $\omega_{a_0 a_1 \ldots a_{n-1}}$ mapped to $\omega_{a_n}$ by $F^n$. The
cylinder sets define refinements of the partition $\mathcal{P}$. We let $\omega^{(0)}$ denote generic elements of $\mathcal{P}^{(0)} = \mathcal{P}$
and $\omega^{(n)}$ denote generic elements of $\mathcal{P}^{(n)}$. Notice that by the non-singularity of the map $F$ on each
partition element and the fact that $\mathcal{P}$ is a partition mod 0, it follows that each $\mathcal{P}^{(n)}$ is also a partition
mod 0 and that Lebesgue almost every point in $\Delta$ falls in the interior of some partition element of $\mathcal{P}$
for all future iterates. In particular almost every $x \in \Delta$ has an associated infinite symbolic sequence
$a(x)$ determined by the future iterates of $x$ in relation to the partition $\mathcal{P}$. To get the much more
sophisticated results on the statistical properties of $f$ we need first of all the following two additional
conditions.
Definition 9. $F : \Delta \to \Delta$ has integrable or summable return times if

\[ \int_\Delta R(x) \, dx = \sum_{\omega \in \mathcal{P}} |\omega| \, R(\omega) < \infty. \]
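
The summability condition can be checked explicitly in a toy case. The Python sketch below takes the doubling map $f(\theta) = 2\theta$, the set $\Delta = [1/2, 1)$ and the first return time to $\Delta$; this choice of example is an assumption made purely for illustration (the doubling map is already a Markov map, so no inducing is needed in practice). In binary notation a point of $\Delta$ with expansion $0.1\,0^{k-1}\,1\ldots$ returns after exactly $k$ iterates, so the elements of the partition are the intervals $\omega_k = [1/2 + 2^{-(k+1)}, 1/2 + 2^{-k})$ with $R(\omega_k) = k$.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def doubling(x):
    return (2 * x) % 1

def return_time(x, max_iter=200):
    """First n >= 1 with f^n(x) in Delta = [1/2, 1)."""
    for n in range(1, max_iter + 1):
        x = doubling(x)
        if x >= HALF:
            return n
    raise ValueError("no return within max_iter iterations")

def element(k):
    """Endpoints of omega_k = {R = k}: [1/2 + 2^-(k+1), 1/2 + 2^-k)."""
    return HALF + Fraction(1, 2 ** (k + 1)), HALF + Fraction(1, 2 ** k)

def partial_integral(K):
    """Sum over k <= K of |omega_k| * R(omega_k)."""
    total = Fraction(0)
    for k in range(1, K + 1):
        lo, hi = element(k)
        total += (hi - lo) * k
    return total

midpoint_times = [return_time(sum(element(k)) / 2) for k in range(1, 11)]
```

Here $|\omega_k| = 2^{-(k+1)}$, the partial sums equal $1 - (K+2)/2^{K+1}$, and the full sum is $\sum_k k \, 2^{-(k+1)} = 1$, so the return times are summable in this example.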

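Definition 10. $F : \Delta \to \Delta$ has the geometric self-similarity property (or, more prosaically, the
(volume) bounded distortion property) if there exists a constant $\mathcal{D} > 0$ such that for all $n \ge 1$ and any
measurable subset $\tilde\omega^{(n)} \subseteq \omega^{(n)} \in \mathcal{P}^{(n)}$ we have

\[ \frac{1}{\mathcal{D}} \frac{|\tilde\omega^{(n)}|}{|\omega^{(n)}|} \le \frac{|F^n(\tilde\omega^{(n)})|}{|F^n(\omega^{(n)})|} \le \mathcal{D} \frac{|\tilde\omega^{(n)}|}{|\omega^{(n)}|}. \tag{2} \]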



This means that the relative measure of subsets of a cylinder set of any level $n$ is preserved up to
some factor $\mathcal{D}$ under iteration by $F^n$. A crucial observation here is that the constant $\mathcal{D}$ is independent
of $n$. Thus in some sense the geometrical structure of any subset of $\Delta$ re-occurs at every scale inside
each partition element of $\mathcal{P}^{(n)}$ up to some bounded distortion factor. This is in principle a very strong
condition but we shall see below that it is possible to verify it in many situations. We shall discuss in
the next section some techniques for verifying this condition in practice. First of all we state the first
result of this section.

Theorem 1. Suppose that $f : M \to M$ admits an induced Markov map satisfying the geometric
self-similarity property and having summable return times. Then it admits an ergodic absolutely
continuous invariant probability measure $\mu$.

This result goes back to the 1950's and is often referred to as the Folklore Theorem of dynamics.
We will sketch below the main ideas of the proof. First however we state a much more recent result
which applies in the same setting but takes the conclusions much further in the direction of mixing and
rates of decay of correlations. First of all we shall assume without loss of generality that the greatest
common divisor of all values taken by the return time function $R$ is 1. If this were not the case all
return times would be multiples of some integer $k \ge 2$ and the measure $\mu$ given by the Theorem stated
above would clearly not be mixing. In that case, however, we could just consider the map $\tilde f = f^k$
and the results to be stated below will apply to $\tilde f$ instead of $f$. We define the tail of the return times as
the measure of the set

\[ R_n = \{x \in \Delta : R(x) > n\} \]

of points whose return time is strictly larger than $n$. The integrability condition implies that $R(x) < \infty$
for almost every point and thus

\[ |R_n| \to 0 \]

as $n \to \infty$. However there is a range of possible rates of decay of $|R_n|$ all of which are compatible
with the integrability condition. L.-S. Young observed and proved that a bound on the decay of
correlations for Hölder continuous observables can be obtained from bounds on the rate of decay of
the tail of the return times.

Theorem 2 ([You98, You99]). Suppose that $f : M \to M$ admits an induced Markov map satisfying
the geometric self-similarity property and having summable return times. Then it admits an ergodic
(and mixing) absolutely continuous invariant probability measure. Moreover the correlation function
for Hölder continuous observables satisfies the following bounds:

Exponential tail: If $\exists \alpha > 0$ such that $|R_n| = O(e^{-\alpha n})$, then $\exists \tilde\alpha > 0$ such that $C_n = O(e^{-\tilde\alpha n})$.
Polynomial tail: If $\exists \alpha > 1$ such that $|R_n| = O(n^{-\alpha})$, then $C_n = O(n^{-\alpha + 1})$.






Other papers have also addressed the question of the decay of correlations for similar setups mainly
using spectral operator methods [You98, Bre99, Mau01a, BreFerGal99, Mau01b]. We remark that the
results about the rates of decay of correlations generally require an a priori slightly stronger form of
bounded distortion than that given in (2). The proof in [You99] uses a very geometrical/probabilistic
coupling argument which appears to be quite versatile and flexible. Variations of the argument have
been applied to prove the following generalizations which apply in the same setting as above (in both
cases we state only a particular case of the theorems proved in the cited papers).

The first one extends Young's result to arbitrarily slow rates of decay. We say that $\rho : \mathbb{R}^+ \to \mathbb{R}^+$ is
slowly varying (see [Aar97]) if for all $y > 0$ we have $\lim_{x \to \infty} \rho(xy) / \rho(x) = 1$. A simple example of
a slowly varying function is the function $\rho(x) = e^{(\log x)/(\log\log x)}$. Let $\hat R_n = \sum_{\hat n \ge n} |R_{\hat n}|$.

Theorem 3 ([Hol04]). The correlation function for Hölder continuous observables satisfies the following bound.

Slowly varying tail: If $\hat R_n = O(\rho(n))$ where $\rho$ is a monotonically decreasing to zero, slowly varying, $C^\infty$ function, then $C_n = O(\rho(n))$.

The second extends Young's result to observables with a very weak, non-Hölder, modulus of continuity. We say that $\psi : I \to \mathbb{R}$ has a logarithmic modulus of continuity $\gamma$ if there exists $C > 0$ such that for all $x, y \in I$ we have

$$|\psi(x) - \psi(y)| \le C\,\bigl|\log|x-y|\bigr|^{-\gamma}.$$

For both the exponential and polynomial tail situations we have the following

Theorem 4 ([Lyn04]). There exists $\alpha > 0$ such that for all $\gamma$ sufficiently large and observables with logarithmic modulus of continuity $\gamma$, we have $C_n = O(n^{-\alpha})$.

These general results indicate that the rate of decay of correlations is linked to what is in effect the geometrical structure of $f$ as reflected in the tail of the return times for the induced map $F$. From a technical point of view they shift the problem of the statistical properties of $f$ to the problem of the geometrical structure of $f$ and thus to the (still highly non-trivial) problem of showing that $f$ admits an induced Markov map and of estimating the tail of the return times of this map. The construction of an induced map in certain examples is relatively straightforward and essentially canonical, but the most interesting constructions require statistical arguments even to show that such a map exists and to estimate the tail of the return times. In these cases the construction is not canonical and it is usually not completely clear to what extent the estimates might depend on the construction.

We now give a sketch of the proof of Theorem 1. The proofs of Theorems 2, 3 and 4 are in a similar spirit and we refer the interested reader to the original papers. We assume throughout the next few sections that $F : \Delta \to \Delta$ is the Markov induced map associated to $f : I \to I$ and $\mathcal{P}^{(n)}$ are the family of cylinder sets generated by the Markov partition $\mathcal{P} = \mathcal{P}^{(1)}$ of $\Delta$. We first define a measure $\nu$ on $\Delta$ and show that it is $F$-invariant, ergodic, and absolutely continuous with respect to Lebesgue. Then we define the measure $\mu$ on $I$ in terms of $\nu$ and show that it is $f$-invariant, ergodic, and absolutely continuous.

2.1. The invariant measure for F. We start with a preliminary result which is a consequence of the 
bounded distortion property. 

2.1.1. The measure of cylinder sets. A straightforward but remarkable consequence of the bounded 
distortion property is that the measure of cylinder sets tends to zero uniformly. 
Lemma 2.1.

$$\max\{|\omega^{(n)}| : \omega^{(n)} \in \mathcal{P}^{(n)}\} \to 0 \quad \text{as } n \to \infty.$$

Notice that in the one-dimensional case, the measure of an interval coincides with its diameter and 
so this implies in particular that the diameter of cylinder sets tends to zero, implying the essential 
uniqueness of the symbolic representation of itineraries. 

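Proof. It is sufficient to show that there exists a constant $\tau \in (0, 1)$ such that for every $n \ge 1$ and every $\omega^{(n)} \subset \omega^{(n-1)}$ we have

$$\frac{|\omega^{(n)}|}{|\omega^{(n-1)}|} \le \tau. \tag{3}$$

Applying this inequality recursively then implies $|\omega^{(n)}| \le \tau|\omega^{(n-1)}| \le \tau^2|\omega^{(n-2)}| \le \cdots \le \tau^n|\omega^{(0)}| \le \tau^n|\Delta|$. To verify (3) we shall show that

$$1 - \frac{|\omega^{(n)}|}{|\omega^{(n-1)}|} = \frac{|\omega^{(n-1)}| - |\omega^{(n)}|}{|\omega^{(n-1)}|} = \frac{|\omega^{(n-1)} \setminus \omega^{(n)}|}{|\omega^{(n-1)}|} \ge 1 - \tau. \tag{4}$$

To prove (4) let first of all $\delta = \max_{\omega \in \mathcal{P}} |\omega| < |\Delta|$. Then, from the definition of cylinder sets we have that $F^{n-1}(\omega^{(n-1)}) = \Delta$ and that $F^{n-1}(\omega^{(n)}) \in \mathcal{P} = \mathcal{P}^{(1)}$ and therefore $|F^{n-1}(\omega^{(n)})| \le \delta$ or, equivalently, $|F^{n-1}(\omega^{(n-1)} \setminus \omega^{(n)})| \ge |\Delta| - \delta > 0$. Thus, using the bounded distortion property we have

$$\frac{|\omega^{(n-1)} \setminus \omega^{(n)}|}{|\omega^{(n-1)}|} \ge \frac{1}{\mathcal{D}}\,\frac{|F^{n-1}(\omega^{(n-1)} \setminus \omega^{(n)})|}{|F^{n-1}(\omega^{(n-1)})|} \ge \frac{|\Delta| - \delta}{|\Delta|\,\mathcal{D}}$$

and (4) follows choosing $\tau = 1 - (|\Delta| - \delta)/(|\Delta|\,\mathcal{D})$. □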

The next property actually follows only from the conclusions of Lemma 2.1 rather than from the bounded distortion property itself. It essentially says that it is possible to "zoom in" to any given set of positive measure.

Lemma 2.2. For any $\varepsilon > 0$ and any Borel set $A$ with $|A| > 0$ there exists $n \ge 1$ and $\omega^{(n)} \in \mathcal{P}^{(n)}$ such that

$$|A \cap \omega^{(n)}| \ge (1-\varepsilon)|\omega^{(n)}|.$$

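Proof. Fix some $\varepsilon > 0$. Suppose first of all that $A$ is compact. Then, using the properties of Lebesgue measure it is possible to show that for any $\eta > 0$ there exists an integer $n \ge 1$ and a collection $\mathcal{I}_\eta = \{\omega^{(n)}\} \subset \mathcal{P}^{(n)}$ such that $A \subset \bigcup_{\mathcal{I}_\eta} \omega^{(n)}$ and $\bigl|\bigcup_{\mathcal{I}_\eta} \omega^{(n)}\bigr| \le |A| + \eta$. Now suppose by contradiction that $|\omega^{(n)} \cap A| \le (1-\varepsilon)|\omega^{(n)}|$ for every $\omega^{(n)} \in \mathcal{I}_\eta$, for any given $\eta > 0$. Using the fact that the $\omega^{(n)} \in \mathcal{I}_\eta$ are disjoint, and thus $\bigl|\bigcup_{\mathcal{I}_\eta} \omega^{(n)}\bigr| = \sum_{\mathcal{I}_\eta} |\omega^{(n)}|$, this implies that

$$|A| = \sum_{\omega^{(n)} \in \mathcal{I}_\eta} |\omega^{(n)} \cap A| \le (1-\varepsilon) \sum_{\omega^{(n)} \in \mathcal{I}_\eta} |\omega^{(n)}| \le (1-\varepsilon)(|A| + \eta).$$

Since $\eta$ can be chosen arbitrarily small after fixing $\varepsilon$, and $|A| > 0$, this gives a contradiction. If $A$ is not compact we can approximate it from below in measure by compact sets and repeat essentially the same argument. □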
2.1.2. Absolute continuity. The following estimate also follows immediately from the bounded distortion property. It says that the absolute continuity property of $F$ on partition elements is preserved up to arbitrary scale with uniform bounds.

Lemma 2.3. Let $A \subset \Delta$ and $n \ge 1$. Then

$$|F^{-n}(A)| \le \mathcal{D}|A|.$$

Proof. The Markov property implies that $F^{-n}(A)$ is a union of disjoint sets each contained in the interior of some element $\omega^{(n)} \in \mathcal{P}^{(n)}$. Moreover each $\omega^{(n)}$ is mapped by $F^n$ to $\Delta$ with uniformly bounded distortion, thus we have $|F^{-n}(A) \cap \omega^{(n)}|/|\omega^{(n)}| \le \mathcal{D}|A|/|\Delta|$ or, equivalently, $|F^{-n}(A) \cap \omega^{(n)}| \le \mathcal{D}|A|\,|\omega^{(n)}|/|\Delta|$. Therefore

$$|F^{-n}(A)| = \sum_{\omega^{(n)} \in \mathcal{P}^{(n)}} |F^{-n}(A) \cap \omega^{(n)}| \le \frac{\mathcal{D}|A|}{|\Delta|} \sum_{\omega^{(n)} \in \mathcal{P}^{(n)}} |\omega^{(n)}| = \mathcal{D}|A|.$$

□



2.1.3. The pull-back of a measure. For any $n \ge 1$ and Borel $A \subset \Delta$, let

$$\nu_n(A) = \frac{1}{n}\sum_{i=0}^{n-1} |F^{-i}(A)|.$$

It is easy to see that $\nu_n$ is a probability measure on $\Delta$ and absolutely continuous with respect to Lebesgue. Moreover, Lemma 2.3 implies that the absolute continuity property is uniform in $n$ and $A$ in the sense that $\nu_n(A) \le \mathcal{D}|A|$ for any $A$ and for any $n \ge 1$. By some standard results of functional analysis, this implies the following

Lemma 2.4. There exists a probability measure $\nu$ and a subsequence $\{\nu_{n_k}\}$ such that, for every measurable set $A$,

$$\nu_{n_k}(A) \to \nu(A) \le \mathcal{D}|A|. \tag{5}$$

In particular $\nu$ is absolutely continuous with respect to Lebesgue.
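The averages $\nu_n(A) = \frac{1}{n}\sum_{i=0}^{n-1}|F^{-i}(A)|$ can be estimated numerically, since $|F^{-i}(A)|$ is the Lebesgue measure of the set of points $x$ with $F^i(x) \in A$. The sketch below is only an illustration under assumptions that are not part of the text: it takes the Gauss map $x \mapsto 1/x \bmod 1$ as a concrete full-branch Markov map (so $F = f$ and $\Delta = [0,1]$), uses a Monte Carlo sample of uniformly distributed points, and compares the result with the classical Gauss measure $\log_2(1+t)$ of the interval $[0,t]$. All function names are hypothetical.

```python
import math
import random


def gauss_map(x: float) -> float:
    """Gauss map x -> 1/x mod 1; the point 0 is sent to 0 by convention."""
    return (1.0 / x) % 1.0 if x > 0.0 else 0.0


def cesaro_measure(t: float, n: int, samples: int, seed: int = 0) -> float:
    """Monte Carlo estimate of nu_n([0, t]) = (1/n) * sum_{i<n} |F^{-i}([0, t])|."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        x = rng.random()  # a Lebesgue-distributed starting point
        for _ in range(n):
            if x <= t:  # F^i(x) lies in [0, t]
                hits += 1
            x = gauss_map(x)
    return hits / (n * samples)


def gauss_measure(t: float) -> float:
    """Invariant (Gauss) measure of [0, t], with density 1 / ((1 + x) log 2)."""
    return math.log2(1.0 + t)


if __name__ == "__main__":
    print(cesaro_measure(0.5, 30, 20000), gauss_measure(0.5))
```

Floating-point orbits are only a heuristic stand-in for true orbits, so the agreement should be read as an illustration of the construction rather than as evidence for the lemma.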

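2.1.4. Invariance. To show that $\nu$ is $F$-invariant, let $A \subset \Delta$ be a measurable set. Then, by (5) we have

$$\nu(F^{-1}(A)) = \lim_{k\to\infty} \frac{1}{n_k}\sum_{i=0}^{n_k-1} |F^{-(i+1)}(A)| = \lim_{k\to\infty}\left(\frac{1}{n_k}\sum_{i=0}^{n_k-1} |F^{-i}(A)| - \frac{|A|}{n_k} + \frac{|F^{-n_k}(A)|}{n_k}\right).$$

Since $|A|$ and $|F^{-n_k}(A)|$ are both uniformly bounded by 1, we have $|A|/n_k \to 0$ and $|F^{-n_k}(A)|/n_k \to 0$ as $k \to \infty$. Therefore

$$\nu(F^{-1}(A)) = \lim_{k\to\infty} \frac{1}{n_k}\sum_{i=0}^{n_k-1} |F^{-i}(A)| = \nu(A).$$

Therefore $\nu$ is $F$-invariant.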
2.1.5. Ergodicity and uniqueness. Let $A \subset \Delta$ be a measurable set with $F^{-1}(A) = A$ and $\nu(A) > 0$. We shall show that $\nu(A) = |A| = 1$. This implies both ergodicity and uniqueness of $\nu$. Indeed, if $\tilde\nu$ were another such invariant absolutely continuous measure, there would have to be a set $B$ with $F^{-1}(B) = B$ and $\tilde\nu(B) = 1$. But in this case we would have also $|B| = 1$ and thus $A = B$ mod 0. This is impossible since two distinct ergodic absolutely continuous invariant measures must have disjoint support.

To prove that $|A| = 1$, let $A^c = \Delta \setminus A$ denote the complement of $A$. Notice that $x \in A^c$ if and only if $F(x) \in A^c$ and therefore $F(A^c) = A^c$. By Lemma 2.2, for any $\varepsilon > 0$ there exists some $n \ge 1$ and $\omega^{(n)} \in \mathcal{P}^{(n)}$ such that $|A \cap \omega^{(n)}| \ge (1-\varepsilon)|\omega^{(n)}|$ and therefore

$$\frac{|\omega^{(n)} \cap A^c|}{|\omega^{(n)}|} \le \varepsilon.$$

Using the fact that $F^n(\omega^{(n)}) = \Delta$ and the invariance of $A^c$ we have $F^n(\omega^{(n)} \cap A^c) = A^c$. The bounded distortion property then gives

$$|A^c| = \frac{|F^n(\omega^{(n)} \cap A^c)|}{|F^n(\omega^{(n)})|} \le \mathcal{D}\,\frac{|\omega^{(n)} \cap A^c|}{|\omega^{(n)}|} \le \mathcal{D}\varepsilon.$$

Since $\varepsilon$ is arbitrary this implies $|A^c| = 0$ and thus $|A| = 1$.

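2.2. The invariant measure for $f$. We now show how to define a probability measure $\mu$ which is invariant for the original map $f$ and satisfies all the required properties.

2.2.1. The probability measure $\mu$. We let $\nu_\omega$ denote the restriction of $\nu$ to the partition element $\omega \in \mathcal{P}$, i.e. for any measurable set $A \subset \Delta$ we have $\nu_\omega(A) = \nu(A \cap \omega)$. Then $\nu(A) = \sum_{\omega \in \mathcal{P}} \nu_\omega(A)$. Then, for any measurable set $A \subset M$ (we no longer restrict our attention to $\Delta$) we define

$$\hat\mu(A) = \sum_{\omega \in \mathcal{P}} \sum_{j=0}^{R(\omega)-1} \nu_\omega(f^{-j}(A)).$$

Notice that this is a sum of non-negative terms and is uniformly bounded since

$$\hat\mu(A) \le \hat\mu(M) = \sum_{\omega \in \mathcal{P}} \sum_{j=0}^{R(\omega)-1} \nu_\omega(f^{-j}(M)) = \sum_{\omega \in \mathcal{P}} \sum_{j=0}^{R(\omega)-1} \nu_\omega(M) = \sum_{\omega \in \mathcal{P}} R(\omega)\,\nu(\omega) < \infty$$

by the assumption on the summability of the return times (recall that $\nu(\omega) \le \mathcal{D}|\omega|$). Thus it defines a finite measure on $M$ and from this we define a probability measure by normalizing to get

$$\mu(A) = \hat\mu(A)/\hat\mu(M).$$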

2.2.2. Absolute continuity. The absolute continuity of $\mu$ is an almost immediate consequence of the definition and the absolute continuity of $\nu$. Indeed, $|A| = 0$ implies $\nu(A) = 0$ which implies $\nu_\omega(A) = 0$ for all $\omega \in \mathcal{P}$, which therefore implies that we have

$$\hat\mu(A) = \sum_{\omega \in \mathcal{P}} \sum_{j=0}^{R(\omega)-1} \nu_\omega(f^{-j}(A)) = 0$$

and therefore $\mu(A) = 0$.
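2.2.3. Invariance. Recall first of all that by definition $f^{R(\omega)}(\omega) = \Delta$ for any $\omega \in \mathcal{P}$. Therefore, for any $A \subset M$ we have

$$f^{-R(\omega)}(A) \cap \omega = F|_\omega^{-1}(A) \cap \omega,$$

where $F|_\omega^{-1}$ denotes the inverse of the restriction $F|_\omega$ of $F$ to $\omega$ (notice that $f^{-R(\omega)}(A) \cap \omega = \emptyset$ if $A \cap \Delta = \emptyset$). In particular, using the invariance of $\nu$ under $F$, this gives

$$\sum_{\omega \in \mathcal{P}} \nu(f^{-R(\omega)}(A) \cap \omega) = \sum_{\omega \in \mathcal{P}} \nu(F|_\omega^{-1}(A) \cap \omega) = \nu(F^{-1}(A)) = \nu(A).$$

Using this equality we get, for any measurable set $A \subset I$,

$$\begin{aligned}
\hat\mu(f^{-1}(A)) &= \sum_{\omega \in \mathcal{P}} \sum_{j=0}^{R(\omega)-1} \nu_\omega(f^{-(j+1)}(A)) \\
&= \sum_{\omega \in \mathcal{P}} \Bigl[\nu(f^{-1}(A) \cap \omega) + \cdots + \nu(f^{-R(\omega)}(A) \cap \omega)\Bigr] \\
&= \sum_{\omega \in \mathcal{P}} \sum_{j=1}^{R(\omega)-1} \nu(f^{-j}(A) \cap \omega) + \sum_{\omega \in \mathcal{P}} \nu(f^{-R(\omega)}(A) \cap \omega) \\
&= \sum_{\omega \in \mathcal{P}} \sum_{j=1}^{R(\omega)-1} \nu_\omega(f^{-j}(A)) + \nu(A) \\
&= \sum_{\omega \in \mathcal{P}} \sum_{j=0}^{R(\omega)-1} \nu_\omega(f^{-j}(A)) \\
&= \hat\mu(A).
\end{aligned}$$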

2.2.4. Ergodicity and uniqueness. Ergodicity of $\mu$ follows immediately from the ergodicity of $\nu$ since every fully invariant set for $f$ of positive measure must intersect the image of some partition element $\omega$ and therefore must have positive (and therefore full) measure for $\nu$ and therefore must have full measure for $\mu$. Notice however that we can only claim a limited form of uniqueness for the measure $\mu$. Indeed, the support of $\mu$ is given by

$$\mathrm{supp}(\mu) = \bigcup_{\omega \in \mathcal{P}} \bigcup_{j=0}^{R(\omega)-1} f^j(\omega)$$

which is the union of all the images of all partition elements. Then $\mu$ is indeed the unique ergodic absolutely continuous invariant measure on this set. However in a completely abstract setting there is no way of saying that $\mathrm{supp}(\mu) = M$ nor that there may not be other relevant measures in $M \setminus \mathrm{supp}(\mu)$.

2.3. Expansion and distortion estimates. The application of the abstract results discussed above to specific examples involves three main steps:

• Combinatorial construction of the induced map;

• Verification of the bounded distortion property;

• Estimation of the tail of the return time function and verification of the integrability of the return times.

We shall discuss some of these steps in some detail in relation to some of the specific cases as we go through them below. Here we just make a few remarks concerning the bounded distortion property and in particular the crucial role played by regularity and derivative conditions in these calculations.

We begin with a quite general observation which relates the geometric self-similarity condition to a property involving the derivative of $F$. Let $F : \Delta \to \Delta$ be a Markov map which is continuously differentiable on each element of the partition $\mathcal{P}$. We let $\det DF^n$ denote the determinant of the derivative of the map $F^n$.

Definition 11. We say that $F$ has uniformly bounded derivative distortion if there exists a constant $\mathcal{D} > 0$ such that for all $n \ge 1$ and $\omega \in \mathcal{P}^{(n)}$ we have

$$\mathrm{Dist}(F^n, \omega) := \max_{x,y \in \omega} \log \frac{|\det DF^n(x)|}{|\det DF^n(y)|} \le \mathcal{D}. \tag{6}$$

Notice that this is just the infinitesimal version of the self-similarity bounded distortion property and indeed it is possible to show that this condition implies the geometric self-similarity property. In the one-dimensional setting and assuming $J \subset \omega$ to be an interval, this implication follows immediately from the Mean Value Theorem. Indeed, in one dimension the determinant of the derivative is just the derivative itself. Thus, the Mean Value Theorem implies that there exists $x \in \omega$ such that $|DF^n(x)| = |F^n(\omega)|/|\omega|$ and $y \in J$ such that $|DF^n(y)| = |F^n(J)|/|J|$. Therefore

$$\frac{|\omega|}{|J|}\,\frac{|F^n(J)|}{|F^n(\omega)|} = \frac{|F^n(J)|/|J|}{|F^n(\omega)|/|\omega|} = \frac{|DF^n(y)|}{|DF^n(x)|} \le e^{\mathcal{D}}. \tag{7}$$

To verify (6) we use the chain rule to write

$$\log \frac{|\det DF^n(x)|}{|\det DF^n(y)|} = \log \prod_{i=0}^{n-1} \frac{|\det DF(F^i(x))|}{|\det DF(F^i(y))|} = \sum_{i=0}^{n-1} \log \frac{|\det DF(F^i(x))|}{|\det DF(F^i(y))|}. \tag{8}$$

Now adding and subtracting $|\det DF(F^i(y))|/|\det DF(F^i(y))|$ and using the fact that $\log(1+x) \le x$ for $x \ge 0$ gives

$$\log \frac{|\det DF(F^i(x))|}{|\det DF(F^i(y))|} \le \log\left(\frac{|\det DF(F^i(x)) - \det DF(F^i(y))|}{|\det DF(F^i(y))|} + 1\right) \le \frac{|\det DF(F^i(x)) - \det DF(F^i(y))|}{|\det DF(F^i(y))|}.$$

Therefore we have

$$\log \frac{|\det DF^n(x)|}{|\det DF^n(y)|} \le \sum_{i=0}^{n-1} \frac{|\det DF(F^i(x)) - \det DF(F^i(y))|}{|\det DF(F^i(y))|}. \tag{9}$$

The inequality (9) gives us the basic tool for verifying the required distortion properties in particular examples.
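The chain-rule estimate can be checked numerically on a concrete map. The sketch below is an illustration only, under assumptions not taken from the text: it uses the smooth expanding circle map $F(x) = 2x + \epsilon \sin(2\pi x) \bmod 1$ with the hypothetical value $\epsilon = 0.1$, and compares the logarithm of the ratio of derivatives of $F^n$ at two nearby points with the sum of relative differences of $DF$ along the two orbits.

```python
import math

EPS = 0.1  # hypothetical perturbation size; DF stays above 2 - 2*pi*EPS > 1


def F(x: float) -> float:
    """Smooth uniformly expanding circle map (illustrative choice)."""
    return (2.0 * x + EPS * math.sin(2.0 * math.pi * x)) % 1.0


def DF(x: float) -> float:
    """Derivative of F; periodic, so it is well defined on the circle."""
    return 2.0 + 2.0 * math.pi * EPS * math.cos(2.0 * math.pi * x)


def distortion_terms(x: float, y: float, n: int) -> tuple[float, float]:
    """Return (log |DF^n(x)| / |DF^n(y)|, sum of |DF(x_i) - DF(y_i)| / |DF(y_i)|)."""
    lhs = 0.0
    rhs = 0.0
    for _ in range(n):
        lhs += math.log(DF(x) / DF(y))       # chain rule, one factor per iterate
        rhs += abs(DF(x) - DF(y)) / DF(y)    # bound from log(1 + t) <= t
        x, y = F(x), F(y)
    return lhs, rhs


if __name__ == "__main__":
    print(distortion_terms(0.2, 0.2 + 1e-6, 12))
```

For points that stay close for $n$ iterates, as points of a common cylinder set do, the right-hand side remains small because the distances $|F^i(x) - F^i(y)|$ shrink geometrically when read backwards from time $n$.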



3. Uniformly Expanding Maps 



In this section we discuss maps which are uniformly expanding. 
3.1. The smooth/Markov case. We say that $f$ is uniformly expanding if there exist constants $C, \lambda > 0$ such that for all $x \in M$, all $v \in T_xM$, and all $n \ge 0$, we have

$$\|Df^n_x(v)\| \ge Ce^{\lambda n}\|v\|.$$

We remark once again that this is a special case of the nonuniform expansivity condition.

Theorem 5. Let $f : M \to M$ be $C^2$ uniformly expanding. Then there exists a unique acip $\mu$ [Ren57, Gel59, Par60, Rue68, Ave68, KrzSzl69, Wat70, Las73]. The measure $\mu$ is mixing and the correlation function decays exponentially fast [Sin72, Per74, Bow75, Rue76].

The references given here use a variety of arguments, some of which use the remarkable observation that uniformly expanding maps are intrinsically Markov in the strong sense given above, with $\Delta = M$, a finite number of partition elements and return time $R \equiv 1$ (this is particularly easy to see in the case of one-dimensional circle maps $f : S^1 \to S^1$). Thus the main issue here is the verification of the distortion condition.

One way to show this is to show that there is a uniform upper bound independent of $n$ for the sum in (9) above. Indeed, notice first of all that the expansivity condition implies in particular that $|\det DF(F^i(y))| \ge Ce^{\lambda} \ge C > 0$ for every $y$, and the regularity condition implies that $\det DF$ is Lipschitz: there exists $L > 0$ such that $|\det DF(F^i(x)) - \det DF(F^i(y))| \le L|F^i(x) - F^i(y)|$ for all $x, y \in M$. Substituting these inequalities into (9) we get

$$\sum_{i=0}^{n-1} \frac{|\det DF(F^i(x)) - \det DF(F^i(y))|}{|\det DF(F^i(y))|} \le \frac{L}{C}\sum_{i=0}^{n-1} |F^i(x) - F^i(y)|.$$

The next, and final, step uses the expansivity condition as well as, implicitly, the Markov property in a crucial way. Indeed, let $\mathrm{diam}\,M$ denote the diameter of $M$, i.e. the maximum distance between any two points in $M$. The definition of $\mathcal{P}^{(n)}$ implies that $\omega$ is mapped diffeomorphically to $M$ by $F^n$ and thus

$$\mathrm{diam}\,M \ge |F^n(x) - F^n(y)| \ge Ce^{\lambda(n-i)}|F^i(x) - F^i(y)|$$

for every $i = 0, \ldots, n-1$. Therefore

$$\sum_{i=0}^{n-1} |F^i(x) - F^i(y)| \le \frac{\mathrm{diam}\,M}{C}\sum_{i=0}^{n-1} e^{-\lambda(n-i)} \le \frac{\mathrm{diam}\,M}{C}\sum_{i=0}^{\infty} e^{-\lambda i}. \tag{10}$$

Substituting back into the previous estimate and (9) gives a bound for the distortion which is independent of $n$.

The regularity condition on $\det DF$ can be weakened somewhat but not completely. There exist examples of one-dimensional circle maps $f : S^1 \to S^1$ which are $C^1$ uniformly expanding (and thus Markov as above) but for which the uniqueness of the absolutely continuous invariant measure fails [Qua96, CamQua01], essentially due to the failure of the bounded distortion calculation. On the other hand, the distortion calculation above goes through with minor modifications as long as $\det DF$ is just Hölder continuous. In some situations, such as the one-dimensional Gauss map $f(x) = x^{-1} \bmod 1$, which is Markov but for which the derivative $Df$ is not even Hölder continuous, one can compensate by taking advantage of the large derivative. Then it is possible to show directly that the right hand side of (9) is uniformly bounded, even though the Lipschitz estimate above does not hold.

3.2. The non-Markov case. The general (non-Markov) piecewise expanding case is significantly more complicated and even the existence of an absolutely continuous invariant measure is no longer guaranteed [LasYor73, GorSch89, Qua99, Tsu00a, Buz01a]. One possible problem is that the images of the discontinuity set can be very badly distributed and cause havoc with any kind of structure. In the Markov case this does not happen because the set of discontinuities gets mapped to itself by definition. Also the possibility of components being translated in different directions can destroy on a global level the local expansiveness given by the derivative. Moreover, where results exist for rates of decay of correlations, they do not always apply to the case of Hölder continuous observables, as technical reasons sometimes require that different function spaces be considered which are more compatible with the discontinuous nature of the maps. We shall not explicitly comment on the particular classes of observables considered in each case.

In the one-dimensional case these problems are somewhat more controllable and relatively simple conditions guaranteeing the existence of an ergodic invariant probability measure can be formulated even in the case of a countable number of domains of smoothness of the map. These essentially require that the size of the image of all domains on which the map is smooth be strictly positive and that certain conditions on the second derivative are satisfied [LasYor73, Adl73, Bow77, Bow79]. In the higher dimensional case, the situation is considerably more complicated and there are a variety of possible conditions which can be assumed on the discontinuities. The conditions of [LasYor73] were generalized to the two-dimensional context in [Kel79] and then to arbitrary dimensions in [GorBoy89, Buz00a, Tsu01b]. There are also several other papers which prove similar results under various conditions; we mention [Alv00, BuzKel01, BuzPacSch01, Buz01c, Sau00, BuzSar03]. In [Buz99a, Cow02] it is shown that conditions sufficient for the existence of a measure are generic in a certain sense within the class of piecewise expanding maps.

Estimates for the decay of correlations have been proved for non-Markov piecewise smooth maps, although again the techniques have had to be considerably generalized. In terms of setting up the basic arguments and techniques, a similar role to that played by [LasYor73] for the existence of absolutely continuous invariant measures can be attributed to [Kel80, HofKel82, Ryc83] for the problem of decay of correlations in the one-dimensional context. More recently, alternative approaches have been proposed and implemented in [Liv95a, Liv95b, You98]. The approach of [You98] has proved particularly suitable for handling some higher dimensional cases such as [BuzMau02], in which assumptions on the discontinuity set are formulated in terms of topological pressure, and [AlvLuzPin, Gou04], in which they are formulated as geometrical non-degeneracy assumptions and dynamical assumptions on the rate of recurrence of typical points to the discontinuities. The construction of an induced Markov map is combined in [Dia04] with Theorem 4 to obtain estimates for the decay of correlations of non-Hölder observables for Lorenz-like expanding maps. We remark also that the results of [AlvLuzPin, Gou04] apply to more general piecewise nonuniformly expanding maps, see Section 6. It would be interesting to understand the relation between the assumptions of [AlvLuzPin, Gou04] and those of [BuzMau02].

4. Almost Uniformly Expanding Maps 

Perhaps the simplest way to relax the uniform expansivity condition is to allow some fixed (or periodic) point $p$ to have a neutral eigenvalue, e.g. in the one-dimensional setting $|Df(p)| = 1$, while still requiring all other vectors in all directions over the tangent spaces of all points to be strictly expanded by the action of the derivative (though of course not uniformly since the expansion must degenerate near the point $p$). Remarkably this can have extremely dramatic consequences on the dynamics.

There are some recent results for higher-dimensional systems [PolYur01a, Hu01, Gou03] but a more complete picture is available in the one-dimensional setting and thus we concentrate on this case. An initial motivation for these kinds of examples arose from the concept of intermittency in fluid dynamics. A class of one-dimensional maps expanding everywhere except at a fixed point was introduced by Manneville and Pomeau in [ManPom80] as a model of intermittency since numerical studies showed that orbits tend to spend a long time trapped in a neighbourhood of the fixed point with relatively short bursts of chaotic activity outside this neighbourhood. Recent work shows that indeed, these long periods of inactivity near the fixed point are a key to slowing down the mixing process and obtaining examples of systems with subexponential decay of correlations.

We shall consider interval maps $f$ which are piecewise $C^2$ with a $C^1$ extension to the boundaries of the domains and for which the derivative is strictly greater than 1 everywhere except at a fixed point $p$ (which for simplicity we can assume lies at the origin) where $Df(p) = 1$. For definiteness, let us suppose that on a small neighbourhood of $0$ the map takes the form

$$f(x) \approx x + x^2\phi(x)$$

where $\approx$ means that the terms on the two sides of the expression as well as their first and second order derivatives converge as $x \to 0$. We assume moreover that $\phi$ is $C^\infty$ for $x \ne 0$; the precise form of $\phi$ determines the precise degree of neutrality of the fixed point, and in particular affects the second derivative $D^2f$. It turns out that it plays a crucial role in determining the mixing properties and even the very existence of an absolutely continuous invariant measure. For the moment we assume also a strong Markov property: each domain of regularity of $f$ is mapped bijectively to the whole interval. The following result shows that the situation can be drastically different from the uniformly expanding case.

Theorem 6 ([Pia80]). If $f$ is $C^2$ at $p$ (e.g. $\phi(x) \equiv 1$) then $f$ does not admit any acip.

Note that $f$ has the same topological behaviour as a uniformly expanding map: typical orbits continue to wander densely on the whole interval, but the proportion of time which they spend in various regions tends to concentrate on the fixed point, so that, asymptotically, typical orbits spend all their time near 0. It turns out that in this situation there exists an infinite ($\sigma$-finite) absolutely continuous invariant measure which gives finite mass to any set not containing the fixed point and infinite mass to any neighbourhood of $p$ [Tha83].

The situation changes if we relax the condition that $f$ be $C^2$ at $p$ and allow the second derivative $D^2f(x)$ to diverge to infinity as $x \to p$. This means that the derivative increases quickly as one moves away from $p$ and thus nearby points are repelled at a faster rate. This is a very subtle change but it makes all the difference.

Theorem 7. If $\phi$ is of the form $\phi(x) = x^{-\alpha}$ for some $\alpha \in (0, 1)$, then [Pia80, HuYou95] $f$ admits an ergodic acip $\mu$ and [Iso99, LivSauVai98, You99, PolYur01a, Sar02, Hu04, Gou04a] $\mu$ is mixing with polynomial decay of correlations.

If $\phi(1/x) = \log x\,\log^{(2)} x \cdots \log^{(r-1)} x\,(\log^{(r)} x)^{1+\alpha}$ for some $r \ge 1$, $\alpha \in (-1, \infty)$, where $\log^{(r)} = \log\log\cdots\log$ repeated $r$ times, then [Hol04] $f$ admits a mixing acip $\mu$ with decay of correlations

$$C_n = O\bigl((\log^{(r)} n)^{-\alpha}\bigr).$$

Thus, the existence of an absolutely continuous invariant measure as in the uniformly expanding case has been recovered, but the exponential rate of decay of correlations has not. We can think of the indifferent fixed point as having the effect of slowing down this process by trapping nearby points for a disproportionately long time. The estimates in [Sar02, Hu04, Gou04a] include lower bounds as well as upper bounds. The approach in [You99] using Markov induced maps applies also to non-Markov cases and [Hol04] can also be generalized to these cases.

The proofs of Theorem 7 do not use directly the fact that $f$ is nonuniformly expanding. Indeed the fact that $f$ is nonuniformly expanding does not follow automatically from the fact that the map is expanding away from the fixed point $p$. However we can use the existence of the acip to show that this condition is satisfied. Indeed by Birkhoff's Ergodic Theorem, typical points spend a large proportion of time near $p$ but also a positive proportion of time in the remaining part of the space. More formally, by a simple application of Birkhoff's Ergodic Theorem to the function $\log|Df(x)|$, we have that, for $\mu$-almost every $x$,

$$\lim_{n\to\infty} \frac{1}{n}\sum_{i=0}^{n-1} \log|Df(f^i(x))| = \int \log|Df|\,d\mu > 0.$$

The fact that $\int \log|Df|\,d\mu > 0$ follows from the simple observation that $\mu$ is absolutely continuous, finite, and that $\log|Df| > 0$ except at the neutral fixed point.
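The Birkhoff average of $\log|Df|$ can be approximated along a numerical orbit. The following sketch is a hedged illustration, not part of the original argument: it assumes a specific two-branch map with a neutral fixed point at 0, namely $f(x) = x(1 + (2x)^{\gamma})$ on $[0, 1/2)$ and $f(x) = 2x - 1$ on $[1/2, 1]$, with the hypothetical exponent $\gamma = 0.5$ (this corresponds to $\phi(x)$ proportional to $x^{-\alpha}$ with $\alpha = 1 - \gamma$). The function names are also assumptions.

```python
import math

GAMMA = 0.5  # hypothetical exponent: f(x) = x + 2**GAMMA * x**(1 + GAMMA) near 0


def f(x: float) -> float:
    """Two-branch interval map with a neutral fixed point at 0 (illustrative)."""
    return x * (1.0 + (2.0 * x) ** GAMMA) if x < 0.5 else 2.0 * x - 1.0


def log_df(x: float) -> float:
    """log |Df(x)|; equals 0 only at the neutral fixed point x = 0."""
    if x < 0.5:
        return math.log(1.0 + (1.0 + GAMMA) * (2.0 * x) ** GAMMA)
    return math.log(2.0)


def birkhoff_average(x0: float, n: int) -> float:
    """Time average of log |Df| along the first n points of the orbit of x0."""
    total, x = 0.0, x0
    for _ in range(n):
        total += log_df(x)
        x = f(x)
    return total / n


if __name__ == "__main__":
    print(birkhoff_average(0.3, 100000))
```

Since floating-point orbits only shadow true orbits heuristically, the computed value should be taken as an indication that the average is positive, not as an accurate estimate of $\int \log|Df|\,d\mu$.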



We now consider another class of systems which can also exhibit various rates of decay of correlations, but where the mechanism for producing these different rates is significantly more subtle. The most general set-up is that of a piecewise smooth one-dimensional map $f : I \to I$ with some finite set $\mathcal{C}$ of critical/singular points at which $Df = 0$ or $Df = \pm\infty$ and/or at which $f$ may be discontinuous. There are at least two ways to quantify the "uniformity" of the expansivity of $f$ in ways that get reflected in different rates of decay of correlations:

• To consider the rate of growth of the derivatives along the orbits of the critical points;

• To consider the average rate of growth of the derivative along typical orbits.

In this section we will concentrate on the first, somewhat more concrete, approach and describe the main results which have been obtained over the last 20 to 25 years. We shall focus specifically on the smooth case since this is where most results have been obtained. Some partial generalization to the piecewise smooth case can be found in [DiaHolLuz04]. The second approach is somewhat more abstract but also more general since it extends naturally to the higher dimensional context where critical points are not so well defined and/or cannot play such a fundamental role. The main results in this direction will be described in Section 6 below in the framework of a general theory of nonuniformly expanding maps.

5.1. Unimodal maps. We consider the class of interval maps $f : I \to I$ with some finite set $\mathcal{C}$ of non-flat critical points. We recall that $c$ is a critical point if $Df(c) = 0$; the critical point is non-flat if there exists $0 < \ell < \infty$, called the order of the critical point, such that $|Df(x)| \approx |x - c|^{\ell-1}$ for $x$ near $c$; $f$ is unimodal if it has only one critical point, and multimodal if it has more than one. Several results to be mentioned below have been proved under a standard technical negative Schwarzian derivative condition which is a kind of convexity assumption on the derivative of $f$, see [MelStr88] for details. Recent results [Koz00] indicate that this condition is often superfluous and thus we will not mention it explicitly.

The first result on the statistical properties of such maps goes back to Ulam and von Neumann [UlaNeu47] who showed that the top unimodal quadratic map, $f(x) = x^2 - 2$, has an acip. Notice that this map is actually a Markov map but does not satisfy the bounded distortion condition due to the presence of the critical point. It is possible to construct an induced Markov map for $f$ which does satisfy this condition and gives the result, but Ulam and von Neumann used a very direct approach, observing that $f$ is conjugate to a piecewise-linear uniformly expanding Markov map for which Lebesgue measure is invariant and ergodic. This implies that the pull-back of the Lebesgue measure by the conjugacy is an acip for $f$. However, the existence of a smooth conjugacy is extremely rare and such an approach is not particularly effective in general.
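The conjugacy can be verified numerically. The sketch below assumes the classical explicit choice, which is not spelled out in the text: the tent map $T(\theta) = 2\theta$ for $\theta \le 1/2$ and $T(\theta) = 2 - 2\theta$ otherwise, together with the change of coordinates $h(\theta) = 2\cos(\pi\theta)$ from $[0,1]$ to $[-2,2]$. The identity $f \circ h = h \circ T$ follows from $4\cos^2(\pi\theta) - 2 = 2\cos(2\pi\theta)$; the code simply checks it on a grid.

```python
import math


def quadratic(x: float) -> float:
    """Top quadratic map f(x) = x^2 - 2 on [-2, 2]."""
    return x * x - 2.0


def tent(t: float) -> float:
    """Tent map on [0, 1]; piecewise linear with slopes +2 and -2."""
    return 2.0 * t if t <= 0.5 else 2.0 - 2.0 * t


def h(t: float) -> float:
    """Assumed conjugacy h(t) = 2 cos(pi t) from [0, 1] onto [-2, 2]."""
    return 2.0 * math.cos(math.pi * t)


def max_conjugacy_error(points: int = 1001) -> float:
    """Largest value of |f(h(t)) - h(T(t))| over a uniform grid in [0, 1]."""
    grid = (k / (points - 1) for k in range(points))
    return max(abs(quadratic(h(t)) - h(tent(t))) for t in grid)


if __name__ == "__main__":
    print(max_conjugacy_error())
```

Note that $h$ is smooth on the open interval but its derivative vanishes at the endpoints, which is consistent with the invariant density of $f$ being unbounded near $\pm 2$.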
More general and more powerful approaches have allowed the existence of an acip to be proved under increasingly general assumptions on the behaviour of the critical point. Let

$$D_n(c) = |Df^n(f(c))|.$$

Notice that the derivative along the critical orbit needs to be calculated starting from the critical value and not from the critical point itself, for otherwise it would be identically 0.
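As a small worked example of this quantity, consider the map $f(x) = x^2 - 2$ mentioned above, with critical point $c = 0$. The critical value is $f(0) = -2$, which is mapped to the fixed point $2$, so every factor $|Df| = |2x|$ along the critical orbit equals 4 and $D_n(c) = 4^n$. The sketch below (function name hypothetical) computes $D_n(c)$ by the chain rule.

```python
def critical_derivative_growth(n: int) -> float:
    """D_n(c) = |Df^n(f(c))| for f(x) = x^2 - 2 with critical point c = 0."""
    x = 0.0 * 0.0 - 2.0  # start from the critical value f(c) = -2
    product = 1.0
    for _ in range(n):
        product *= abs(2.0 * x)  # chain rule factor |Df(x)| = |2x|
        x = x * x - 2.0          # next point on the critical orbit
    return product


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, critical_derivative_growth(n))
```

Here the critical point is pre-periodic and $D_n$ grows exponentially, which is the strongest of the growth conditions discussed in this section.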

Theorem 8. Let $f : I \to I$ be a unimodal map. Then $f$ admits an ergodic acip if the following conditions hold (each condition is implied by the preceding ones):

• The critical point is pre-periodic [Rue77];

• The critical point is non-recurrent [Mis81];

• $D_n \to \infty$ exponentially fast [ColEck83, NowStr88];

• $D_n \to \infty$ sufficiently fast so that $\sum_n D_n^{-1/\ell} < \infty$ [NowStr91];

• $D_n \to \infty$ [BruSheStr03].

If $D_n \to \infty$ exponentially fast then some power of $f$ is mixing and exhibits exponential decay of correlations [KelNow92] (and [You92] with additional bounded recurrence assumptions on the critical point).

Notice that the condition of [BruSheStr03] is extremely weak. In fact they show that it is sufficient for $D_n$ to be eventually bounded below by some constant depending only on the order of the critical point. However even this condition is not optimal as there are examples of maps for which $\liminf D_n = 0$ but which still admit an ergodic acip. It would be interesting to know whether an optimal condition is even theoretically possible: it is conceivable that a complete characterization of maps admitting acips in terms of the behaviour of the critical point is not possible because other subtleties come into play.

5.2. Multimodal maps. Given the number of people and research papers in one-dimensional dynamics it is remarkable that until very recently there were essentially no results at all on the existence of acips for multimodal maps. A significant breakthrough was achieved by implementing the strategy of constructing induced Markov maps and estimating the rate of decay of the tail. This strategy also yields estimates for various rates of decay in the unimodal case and extends very naturally to the multimodal case.

Theorem 9 ([BruLuzStr03]). Let $f$ be a multimodal map with a finite set of critical points of order $\ell$ and suppose that

$$\sum_n D_n^{-1/(2\ell-1)}(c) < \infty$$

for each critical point $c$. Then there exists an ergodic acip $\mu$ for $f$. Moreover some power of $f$ is mixing and the correlation function decays at the following rates:

Polynomial case: If there exist $C > 0$, $\tau > 2\ell - 1$ such that

$$D_n(c) \ge Cn^{\tau},$$

for all $c \in \mathcal{C}$ and $n \ge 1$, then, for any $\tilde\tau < \frac{\tau-1}{\ell-1} - 1$, we have

$$C_n = O(n^{-\tilde\tau}).$$

Exponential case: If there exist $C, \beta > 0$ such that

$$D_n(c) \ge Ce^{\beta n}$$

for all $c \in \mathcal{C}$ and $n \ge 1$, then there exists $\tilde\beta > 0$ such that

$$C_n = O(e^{-\tilde\beta n}).$$

These results give previously unknown estimates for the decay of correlations even for unimodal maps in the quadratic family. For example they imply that the so-called Fibonacci maps [LyuMil93] exhibit decay of correlations at rates which are faster than any polynomial. It seems likely that these estimates are essentially optimal although the argument only provides upper bounds. The general framework of [Lyn04] applies to these cases to provide estimates for the decay of correlations for observables which satisfy weaker than Hölder conditions on the modulus of continuity. The conditions for the existence of an acip have recently been weakened to the summability condition $\sum_n D_n^{-1/\ell} < \infty$ and to allow the possibility of critical points of different orders [BruStr01]. Some improved technical expansion estimates have also been obtained in [Ced04] which allow the results on the decay of correlations to apply to maps with critical points of different orders.

Based on these results, the conceptual picture of the causes of slow rates of decay of correlations
appears much more similar to the case of maps with indifferent fixed points than would appear at first
sight: we can think of the case in which the rate of growth of $D_n$ is subexponential as a situation in
which the critical orbit is neutral or indifferent and points which land close to the critical point tend
to remain close to ("trapped" by) its orbit for a particularly long time. During this time orbits are
behaving "non-generically" and are not distributing themselves over the whole space as uniformly as
they should. Thus the mixing process is delayed and the rate of decay of correlations is correspondingly
slower. When $D_n$ grows exponentially, the critical orbit can be thought of as (and indeed is) a
non-periodic hyperbolic repelling orbit and nearby points are pushed away exponentially fast. Thus
there is no significant loss in the rate of mixing, and the decay of correlations is not significantly
slowed down notwithstanding the presence of a critical point.
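
As a purely numerical illustration of the quantities entering Theorem 9, the following sketch computes the derivative growth along the critical orbit and a partial sum of the summability series for the quadratic map $f(x) = x^2 - a$. It assumes the convention $D_n = |(f^n)'(f(c))|$ with critical point $c = 0$ of order $\ell = 2$; the function names are hypothetical and a finite partial sum can of course only suggest, not prove, convergence.

```python
def critical_derivative_growth(a, n_max):
    """Return [D_1, ..., D_{n_max}] with D_n = |(f^n)'(f(0))| for f(x) = x^2 - a."""
    x = -a                      # critical value f(0)
    deriv = 1.0
    growth = []
    for _ in range(n_max):
        deriv *= 2.0 * x        # chain rule with f'(x) = 2x
        x = x * x - a           # move one step along the critical orbit
        growth.append(abs(deriv))
    return growth


def summability_partial_sum(growth, ell=2):
    """Partial sum of sum_n D_n^(-1/(2*ell - 1)) over the supplied values."""
    exponent = -1.0 / (2 * ell - 1)
    return sum(d ** exponent for d in growth)


# Top quadratic map a = 2: the critical value -2 lands on the fixed point 2,
# where |f'| = 4, so D_n = 4^n and the series is geometric.
growth = critical_derivative_growth(2.0, 40)
partial = summability_partial_sum(growth, ell=2)
```

For $a = 2$ the growth is exactly exponential, which corresponds to the exponential case of the theorem; for other parameters the same functions can be used to inspect how fast $D_n$ grows over a finite horizon.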

5.3. Benedicks-Carleson maps. We give here a sketch of the construction of the induced Markov 
map for a class of unimodal maps. We shall try to give a conceptually clear description of the main 
steps and ingredients required in the construction. The details of the argument are unfortunately 
particularly technical and a lot of notation and calculations are carried out only to formally verify 
statements which are intuitively obvious. It is very difficult therefore to be at one and the same time 
conceptually clear and technically honest. We shall therefore concentrate here on the former approach 
and make some remarks about the technical details which we omit or present in a simplified form.
Let

$$f_a(x) = x^2 - a$$

for $x \in I = [-2, 2]$ and

$$a \in \Omega_\varepsilon = [2 - \varepsilon, 2]$$

for some $\varepsilon > 0$ sufficiently small. The assumptions and the details of the proof require the introduction
of several additional constants, some intrinsic to the maps under consideration and some auxiliary for
the purposes of the argument. In particular we suppose that there is a $\lambda \in (0, \log 2)$ and constants

$$\lambda \gg \alpha \gg \hat\delta \gg \delta > 0$$

where $x \gg y$ means that $y$ must be sufficiently small relative to $x$. Finally, to simplify the notation
we also let $\beta = \alpha/\lambda$. We restrict ourselves to parameter values $a \in \Omega_\varepsilon$ which satisfy the
Benedicks-Carleson conditions:

Hyperbolicity: There exists $C > 0$ such that

$$D_n \geq C e^{\lambda n} \quad \forall\, n \geq 1;$$

Slow recurrence:

$$|c_n| \geq e^{-\alpha n} \quad \forall\, n \geq 1.$$

In Section 7 we sketch a proof of the fact that these conditions are satisfied for a positive
measure set of parameters in $\Omega_\varepsilon$ (for any $\lambda \in (0, \log 2)$ and any $\alpha > 0$). They are therefore reasonably
generic conditions. Assuming them here will allow us to present in a compact form an almost complete
proof. During the discussion we shall make some comments about how the argument can be modified
to deal with slower rates of growth of $D_n$ and arbitrary recurrence patterns of the critical orbit.
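
The two Benedicks-Carleson conditions can be tested over a finite number of iterates for a given parameter. The following is a minimal sketch, assuming the conventions $c_0 = f_a(0)$, $c_n = f_a^n(c_0)$ and $D_n = |(f_a^n)'(c_0)|$; the function name and the default values of `lam`, `alpha` and `C` are illustrative choices, not values taken from the text. Passing the test up to time `n_max` says nothing about larger times.

```python
import math


def satisfies_bc_conditions(a, n_max, lam=0.5, alpha=0.01, C=1.0):
    """Check D_n >= C*exp(lam*n) and |c_n| >= exp(-alpha*n) for 1 <= n <= n_max."""
    c = -a            # c_0 = f_a(0), the critical value
    log_d = 0.0       # log D_0 = 0
    for n in range(1, n_max + 1):
        if c == 0.0:
            return False                     # derivative vanishes along the orbit
        log_d += math.log(abs(2.0 * c))      # log D_n = log D_{n-1} + log|f'(c_{n-1})|
        c = c * c - a                        # c_n
        if log_d < math.log(C) + lam * n:
            return False                     # hyperbolicity fails at time n
        if abs(c) < math.exp(-alpha * n):
            return False                     # slow recurrence fails at time n
    return True
```

For $a = 2$ the critical value lands on the repelling fixed point $2$ and both conditions hold for all times; for $a = 1$ the critical point is periodic ($0 \mapsto -1 \mapsto 0$) and slow recurrence fails immediately.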

We remark that the overall strategy as well as several details of the construction in the two argu- 
ments (one proving that the hyperbolicity and slow recurrence conditions occur with positive prob- 
ability and the other proving that they imply the existence of an acip) are remarkably similar. This 
suggests a deeper, yet to be fully understood and exploited, relationship between the structure of 
dynamical space and that of parameter space. 

We let

$$\Delta = (-\delta, \delta) \subset (-\hat\delta, \hat\delta) = \hat\Delta$$

denote $\delta$ and $\hat\delta$ neighbourhoods of the critical point $c$. The aim is to construct a Markov induced map

$$F : \Delta \to \Delta.$$

We shall do this in three steps. We first define an induced map $F_p : \Delta \to I$ which is essentially based
on the time during which points in $\Delta$ shadow the critical orbit. The shadowing time $p$ is piecewise
constant on a countable partition of $\Delta$ but the images of partition elements can be arbitrarily small.
Then we define an induced map from $\Delta$ to $I$ which is still not Markov but has the property that
the images of partition elements are uniformly large. Finally we define the Markov induced map
from $\Delta$ to $\Delta$ as required.

5.4. Expansion outside $\Delta$. Before starting the construction of the induced maps, we state a lemma
which gives some derivative expansion estimates outside the critical neighbourhood $\Delta$.

Lemma 5.1. There exists a constant $C > 0$ independent of $\delta$ such that for $\varepsilon > 0$ sufficiently small,
all $a \in \Omega_\varepsilon$, $f = f_a$, $x \in I$ and $n \geq 1$ such that $x, f(x), \ldots, f^{n-1}(x) \notin \Delta$ we have

$$|Df^n(x)| \geq \delta e^{\lambda n}$$

and if, moreover, $f^n(x) \in \hat\Delta$ and/or $x \in f(\hat\Delta)$ then

$$|Df^n(x)| \geq C e^{\lambda n}.$$

Notice that the constant $C$ and the exponent $\lambda$ do not depend on $\delta$ or $\hat\delta$. This allows us to choose $\delta$
and $\hat\delta$ small in the following argument without worrying about this affecting the expansivity estimates
given here. In general of course both the constants $C$ and $\lambda$ depend on the size of this neighbourhood
and it is an extremely useful feature of this particular range of parameter values that they do not. In
the context of the quadratic family these estimates can be proved directly using the smooth conjugacy
of the top map $f_2$ with the piecewise affine tent map, see [UlaNeu47], [Luz00]. However there are
general theorems in one-dimensional dynamics to the effect that one has uniform expansivity outside
an arbitrary neighbourhood of the critical point under extremely mild conditions [Man85] and this is
sufficient to treat the general case in [BruLuzStr03].
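
The role of the conjugacy for the top map can be made concrete. Under the standard change of variables $x = 2\cos\theta$ (an assumption of this sketch, not a formula stated in the text) one has $f_2(x) = x^2 - 2 = 2\cos 2\theta$, so that $f_2^n(x) = 2\cos(2^n\theta)$ and $|Df_2^n(x)| = 2^n |\sin(2^n\theta)| / |\sin\theta|$. The following sketch compares this closed form with the chain-rule product along the orbit; the function names are hypothetical.

```python
import math


def derivative_along_orbit(x, n):
    """|Df^n(x)| for f(x) = x^2 - 2, computed with the chain rule along the orbit."""
    prod = 1.0
    for _ in range(n):
        prod *= 2.0 * x       # f'(x) = 2x
        x = x * x - 2.0
    return abs(prod)


def derivative_via_conjugacy(x, n):
    """|Df^n(x)| from the closed form obtained with x = 2*cos(theta), |x| < 2."""
    theta = math.acos(x / 2.0)
    return (2.0 ** n) * abs(math.sin((2.0 ** n) * theta) / math.sin(theta))
```

The factor $2^n$ is the derivative of the $n$-th iterate of the tent map; the remaining factor comes from the change of coordinates and is what has to be controlled to obtain estimates of the kind stated in the lemma.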
5.5. Shadowing the critical orbit. We start by defining a partition of the critical neighbourhoods
$\Delta$ and $\hat\Delta$. For any integer $r \geq 1$ let $I_r = [e^{-r}, e^{-r+1})$ and $I_{-r} = (-e^{-r+1}, -e^{-r}]$ and, for each
$|r| \geq r_{\hat\delta} + 1$, let $\hat I_r = I_{r-1} \cup I_r \cup I_{r+1}$. We can suppose without loss of generality that $r_\delta = \log \delta^{-1}$
and $r_{\hat\delta} = \log \hat\delta^{-1}$ are integers. Then

$$\Delta = \{0\} \cup \bigcup_{|r| \geq r_\delta + 1} I_r \quad\text{and}\quad \hat\Delta = \{0\} \cup \bigcup_{|r| \geq r_{\hat\delta} + 1} I_r.$$

This is one of the minor technical points of which we do not give a completely accurate description.
Strictly speaking, the distortion estimates to be given below require a further subdivision of each
$I_r$ into subintervals of equal length. This does not affect significantly any of the other estimates. A
similar partition is defined in Section 7.1.2 in somewhat more detail. We remark also that
the need for the two neighbourhoods $\Delta$ and $\hat\Delta$ will not become apparent in the following sketch of the
argument. We mention it however because it is a crucial technical detail: the region $\hat\Delta \setminus \Delta$ acts as a
buffer zone in which we can choose to apply the derivative estimates of Lemma 5.1 or the shadowing
argument of Lemma 5.2 according to which one is more convenient in a particular situation.
Now let

$$p(r) = \max\{k : |f^{j+1}(x) - f^{j+1}(c)| \leq e^{-2\alpha j} \ \ \forall\, x \in \hat I_r,\ \forall\, j < k\}.$$

This definition was essentially first formulated in [BenCar85] and [BenCar91]. The key characteristic
is that it guarantees a bounded distortion property which in turn allows us to make several estimates
based on information about the derivative growth along the critical orbit. Notice that the definition in
terms of $\alpha$ is based crucially on the fact that the critical orbit satisfies the slow recurrence condition.
We mention below how this definition can be generalized.

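Lemma 5.2. For all points $x \in \hat I_r$ and $p = p(r)$ we have, with $x_0 = f(x)$,

$$|Df^{p+1}(x)| = |Df(x)| \cdot |Df^p(x_0)| \geq e^{(1-7\beta) r}.$$

Recall that $\beta = \alpha/\lambda$ can be chosen arbitrarily small.

Proof. First of all, using the slow recurrence condition, the definition of the binding period, and
arguing as in the distortion estimates for the uniformly expanding maps above, it is not difficult to
show that there is a constant $\mathcal{D}_1$, depending on $\alpha$ but independent of $r$ and $\delta$, such that for all
$x_0, y_0 \in f(\hat I_r)$ and $1 \leq k \leq p$,

$$\left| \frac{Df^k(x_0)}{Df^k(y_0)} \right| \leq \mathcal{D}_1. \tag{11}$$

Using the definition of $p$, the hyperbolicity condition (absorbing its constant $C$ into $\mathcal{D}_1$) and the
fact that $|x_0 - c_0| = x^2 \geq e^{-2(r+1)}$ for $x \in \hat I_r$, this implies

$$e^{-2\alpha(p-1)} \geq |x_{p-1} - c_{p-1}| \geq \mathcal{D}_1^{-1} |Df^{p-1}(c_0)|\, |x_0 - c_0| \geq \mathcal{D}_1^{-1} e^{\lambda(p-1)} e^{-2(r+1)}$$

and thus $\mathcal{D}_1 e^{-2\alpha p} e^{2\alpha} \geq e^{\lambda p} e^{-\lambda} e^{-2(r+1)}$. Rearranging gives

$$p \leq \frac{\log \mathcal{D}_1 + 2\alpha + \lambda + 2r + 2}{\lambda + 2\alpha} \leq \frac{3r}{\lambda} \tag{12}$$

as long as we choose $\delta$ so that $r_\delta$ is sufficiently large in comparison to the other constants, none of
which depend on $\delta$. Moreover

$$\mathcal{D}_1 e^{-2r} |Df^p(x_0)| \geq \mathcal{D}_1 |x_0 - c_0|\, |Df^p(x_0)| \geq |x_p - c_p| \geq e^{-2\alpha p}$$

and therefore, using (12), we have $|Df^p(x_0)| \geq \mathcal{D}_1^{-1} e^{2r} e^{-2\alpha p} \geq \mathcal{D}_1^{-1} e^{(2-6\beta) r}$. Since $x \in \hat I_r$
we have $|Df(x)| = 2|x - c| \geq 2e^{-(r+2)}$ and therefore $|Df^{p+1}(x)| = |Df^p(x_0)|\, |Df(x)| \geq
2 e^{-(r+2)} \mathcal{D}_1^{-1} e^{(2-6\beta) r} \geq e^{-2} \mathcal{D}_1^{-1} e^{(1-6\beta) r}$. This implies the result as long as we choose $r_\delta$ large enough.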

□ 
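
The binding period can be computed numerically for the quadratic family. The sketch below is a simplification under stated assumptions: it evaluates the binding condition only at the outer endpoint $x = e^{-r+1}$ of $I_r$ (rather than over all of $\hat I_r$), and it propagates the difference $d_j = f^{j+1}(x) - f^{j+1}(0)$ directly through the identity $f(y + d) - f(y) = d\,(2y + d)$ to avoid cancellation errors. The function name and the default value of `alpha` are hypothetical.

```python
import math


def binding_period(r, a=2.0, alpha=0.01, max_iter=10_000):
    """Largest k such that |f^{j+1}(x) - f^{j+1}(0)| <= exp(-2*alpha*j) for all j < k,
    evaluated at x = exp(-(r - 1)) for f(x) = x^2 - a."""
    x = math.exp(-(r - 1))
    c = -a                 # c_0 = f(0)
    d = x * x              # f(x) - f(0) = x^2
    j = 0
    while j < max_iter and abs(d) <= math.exp(-2.0 * alpha * j):
        d = d * (2.0 * c + d)   # f(c + d) - f(c), the next difference
        c = c * c - a           # next point of the critical orbit
        j += 1
    return j


# Binding periods for a range of return depths at the top parameter a = 2.
periods = {r: binding_period(r) for r in range(2, 16)}
```

For $a = 2$ the computed values grow roughly linearly in $r$ and stay well below $3r/\lambda$, in line with the bound (12) obtained in the proof.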

Thus we have a first induced map $F_p : \Delta \to I$ given by $F_p(x) = f^{p(x)+1}(x)$ where $p(x) = p(r)$ for
$x \in I_{\pm r}$, which is uniformly expanding. Indeed notice that $|Df^{p(x)+1}(x)| \to \infty$ as $x \to c$. However there
is no reason for which this map should satisfy the Markov property and indeed, an easy calculation
shows that the images of the partition elements are $\sim e^{-7\beta r} \to 0$ and thus not even of uniform size.

The notion of shadowing can be generalized without any assumptions on the recurrence of the critical
orbit in the following way, see [BruLuzStr03]: let $\{\gamma_n\}$ be a monotonically decreasing sequence
with $1 > \gamma_n > 0$ and $\sum \gamma_n < \infty$. Then for $x \in \Delta$, let

$$p(x) := \max\{p : |f^k(x) - f^k(c)| \leq \gamma_k |f^k(c) - c| \ \ \forall\, k \leq p - 1\}.$$

A simple variation of the distortion calculation used above shows that the summability of $\gamma_n$ implies
that (11) holds with this definition also. Analogous bounds on $D_n$ will reflect the rate of growth of
the derivative along the critical orbit. If the growth of $D_n$ is subexponential, the binding period will
last much longer because the interval $|f^k(x) - f^k(c)|$ is growing at a slower rate. The generality of
the definition means that it is more natural to define a partition $I_p$ as the "level sets" of the function
$p(x)$. The drawback is that we have much less control over the precise size of these intervals and their
distance from the critical point. Some estimates of the tail $\{x : p(x) > p\}$ can be obtained and it turns out
that these are closely related to the rate of growth of $D_n$ and to those of the return time function for
the final induced Markov map. This is because the additional two steps, the escape time and the return
time, occur exponentially fast. Thus the only bottleneck is the delay caused by the long shadowing of
the critical orbit.

5.6. The escape partition. Now let $J \subset I$ be an arbitrary interval (which could also be $\Delta$ itself).
We want to construct a partition $\mathcal{P}$ of $J$ and a stopping time $E : J \to \mathbb{N}$ constant on elements of $\mathcal{P}$
with the property that for each $\omega \in \mathcal{P}$, $|f^{E(\omega)}(\omega)| \geq \delta$. We think of $\delta$ as being our definition of large
scale; we call $E(\omega)$ the escape time of $\omega$, we call the interval $f^{E(\omega)}(\omega)$ an escape interval, and call
$\mathcal{P}$ the escape time partition of $J$.

The construction is carried out inductively in the following way. Let $k \geq 1$ and suppose that the
intervals with $E < k$ have already been defined. Let $\omega$ be a connected component of the complement
of the set $\{E < k\} \subset J$. We consider the various cases depending on the position of $f^k(\omega)$. If $f^k(\omega)$
contains $\hat\Delta \cup I_{r_{\hat\delta}} \cup I_{-r_{\hat\delta}}$ then we subdivide $\omega$ into three subintervals satisfying the required properties,
and let $E = k$ on each of them. If $f^k(\omega) \cap \Delta = \emptyset$ we do nothing. If $f^k(\omega) \cap \Delta \neq \emptyset$ but $f^k(\omega)$
does not intersect more than two adjacent $I_r$'s then we say that $k$ is an inessential return and define
the corresponding return depth by $r = \max\{|r| : f^k(\omega) \cap I_r \neq \emptyset\}$. If $f^k(\omega) \cap \Delta \neq \emptyset$ and $f^k(\omega)$
intersects at least three elements of $\mathcal{I}$, then we simply subdivide $\omega$ into subintervals $\omega_r$ in such a way
that each $\omega_r$ satisfies $I_r \subset f^k(\omega_r) \subset \hat I_r$. For $|r| > r_\delta$ we say that $\omega_r$ has an essential return at time $k$
with an associated return depth $r$. For all other $r$, the $\omega_r$ are escape intervals, and for these intervals
we set $E = k$. Finally we consider one more important case. If $f^k(\omega)$ contains $\Delta$ as well as (at least)
the two adjacent partition elements then we subdivide as described above except for the fact that we
keep together that portion of $\omega$ which maps exactly to $\Delta$. We let this belong to the escape partition $\mathcal{P}$
but, as we shall see, we also let it belong to the final partition associated to the full induced Markov
map.

This purely combinatorial algorithm is designed to achieve two things, neither one of which follows
immediately from the construction:

(1) Guarantee uniformly bounded distortion on each partition element up to the escape time;
(2) Guarantee that almost every point eventually belongs to the interior of a partition element.

We shall not enter into the details of the distortion estimates here but discuss the strategy for showing
that $\mathcal{P}$ is a partition mod 0 of $J$. Indeed this follows from a much stronger estimate concerning the
tail of the escape time function: there exists a constant $\gamma > 0$ such that for any interval $J$ with $|J| \geq \delta$
we have

$$|\{x \in J : E(x) \geq n\}| = \sum_{\substack{\omega' \in \mathcal{P} \\ E(\omega') \geq n}} |\omega'| \leq e^{-\gamma n} |J|. \tag{13}$$

The argument for proving (13) revolves fundamentally around the combinatorial information defined
in the construction. More specifically, for $\omega \in \mathcal{P}$ let $r_1, r_2, \ldots, r_s$ denote the sequence of return
depths associated to essential return times occurring before $E(\omega)$, and let $\mathcal{E}(\omega) = \sum_{i=1}^{s} r_i$. Notice
that this sequence may be empty if $\omega$ escapes without intersecting $\Delta$; in this case we set $\mathcal{E}(\omega) = 0$.
We now split the proof into three steps:

1) Relation between escape time and return depths. The first observation is that the escape time is
bounded by a constant multiple of the sum of the return depths: there exists a $\kappa$ depending only on $\lambda$
such that

$$E(\omega) \leq \kappa\, \mathcal{E}(\omega). \tag{14}$$

Notice that a constant $T_0$ should also be added to take care of the case in which $\mathcal{E}(\omega) = 0$, corresponding
to the situation in which $\omega$ has an escape the first time that iterates of $\omega$ intersect $\Delta$. Since $\omega$ is
an escape, it has a minimum size and the exponential growth outside $\Delta$ gives a uniform bound for the
maximum number of iterates within which such a return must occur. Since this constant is uniform
it does not play a significant role and we do not add it explicitly to simplify the notation. For the
situation in which $\mathcal{E}(\omega) > 0$ it is sufficient to show that each essential return with return depth $r$ has
the next essential return or escape within at most $\kappa r$ iterations. Again this follows from the observation
that the derivative is growing exponentially on average during all these iterations: we have exponential
growth outside $\Delta$ and also exponential growth on average during each complete inessential binding
period. This implies (14). From (14) we then have

$$\sum_{E(\omega) \geq n} |\omega| \leq \sum_{\mathcal{E}(\omega) \geq n/\kappa} |\omega|. \tag{15}$$

Thus it is enough to estimate the right hand side of (15), which is saying that there is an exponentially
small probability of having a large total accumulated return depth before escaping, i.e. most intervals
escape after relatively few and shallow return depths. The strategy is perfectly naive and consists of
showing that the size of an interval with a certain return depth $\mathcal{E}$ is exponentially small in $\mathcal{E}$ and that there
cannot be too many of them, so that their total sum is still exponentially small.

2) Relation between size of $\omega$ and return depth. The size of each partition element can be estimated
in terms of the essential return depths in a very coarse, i.e. non-sharp, way which is nevertheless
sufficient for our purposes. The argument relies on the following observation. Every return depth
corresponds to a return which is followed by a binding period. During this binding period there
is a certain overall growth of the derivative. During the remaining iterates there is also derivative
growth, either from being outside $\Delta$ or from the binding period associated to some inessential return.
Therefore a simple application of the Mean Value Theorem gives

(16) |a;| <e-^^^('"). 

3) The cardinality of the $\omega$ with a certain return depth. It therefore remains only to estimate the
cardinality of the set of elements $\omega$ which can have the same value of $\mathcal{E}$. To do this, notice first
of all that we have a bounded multiplicity of elements of $\mathcal{P}$ which can share exactly the same
sequence of return depths. More precisely this corresponds to the number of escaping intervals which
can arise at any given time from the subdivision procedure described above, and is therefore less than
$r_\delta$. Moreover, every return depth is bigger than $r_\delta$ and therefore for a given sequence $r_1, \ldots, r_s$ we
must have $s \leq \mathcal{E}/r_\delta$. Therefore, letting $\eta = r_\delta^{-1}$ and choosing $\delta$ sufficiently small, the result follows from
the following fact: Let $N_{k,s}$ denote the number of sequences $(t_1, \ldots, t_s)$, $t_i \geq 1$ for all $i$, $1 \leq i \leq s$,
such that $\sum_{i=1}^{s} t_i = k$. Then, for all $\hat\eta > 0$, there exists $\eta > 0$ such that for any integers $s, k$ with
$s \leq \eta k$ we have

$$N_{k,s} \leq e^{\hat\eta k}. \tag{17}$$

Indeed, applying (17) we get that the total number of possible sequences is
$N_{\mathcal{E}} = \sum_{s=1}^{\eta \mathcal{E}} N_{\mathcal{E},s} \leq \eta\, \mathcal{E}\, e^{\hat\eta \mathcal{E}} \leq e^{2\hat\eta \mathcal{E}}$. Taking into account the multiplicity of the number of elements sharing the same
sequence we get the bound on this quantity as $\leq r_\delta e^{2\hat\eta \mathcal{E}} \leq e^{3\hat\eta \mathcal{E}}$. Multiplying this by (16) and
substituting into (15) gives the result.

To prove (17), notice first of all that $N_{k,s}$ can be bounded above by the number of ways to choose
$s$ balls from a row of $k + s$ balls, thus partitioning the remaining $k$ balls into at most $s + 1$ disjoint
subsets. Notice also that this expression is monotonically increasing in $s$, and therefore

$$N_{k,s} \leq \binom{k+s}{s} = \binom{k+s}{k} \leq \binom{(1+\eta)k}{k} = \frac{[(1+\eta)k]!}{(\eta k)!\, k!}.$$

Using Stirling's approximation formula $k! \in [1, 1 + \frac{1}{4k}]\sqrt{2\pi k}\, k^k e^{-k}$, we have

$$N_{k,s} \leq \frac{[(1+\eta)k]^{(1+\eta)k}}{(\eta k)^{\eta k}\, k^k} = (1+\eta)^{(1+\eta)k} \eta^{-\eta k} = \exp\{(1+\eta) k \log(1+\eta) - \eta k \log \eta\} \leq \exp\{((1+\eta)\eta - \eta \log \eta) k\}.$$

Clearly $(1+\eta)\eta - \eta \log \eta \to 0$ as $\eta \to 0$. This completes the proof of (17).
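
The counting estimate can be checked directly for small values. The sketch below uses the standard fact that the number of sequences of $s$ positive integers with sum $k$ is the binomial coefficient $\binom{k-1}{s-1}$ (a sharper count than the ball-choosing bound used in the proof) and compares it with the exponential bound $\exp\{((1+\eta)\eta - \eta\log\eta)k\}$. The function names are hypothetical.

```python
import math


def count_sequences(k, s):
    """Number of sequences (t_1, ..., t_s) of integers t_i >= 1 with sum k."""
    if s < 1 or s > k:
        return 0
    return math.comb(k - 1, s - 1)


def exponent_bound(eta):
    """The rate (1 + eta)*eta - eta*log(eta), which tends to 0 as eta -> 0."""
    return (1.0 + eta) * eta - eta * math.log(eta)
```

For instance, with $\eta = 0.1$ the rate is about $0.34$, while with $\eta = 0.01$ it drops to about $0.056$, which illustrates how the exponential growth rate of the number of admissible sequences can be made arbitrarily small.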

5.7. The return partition. Finally we need to construct the full induced Markov map. To do this
we simply start with $\Delta$ and construct the escape partition $\mathcal{P}$ of $\Delta$. Notice that this is a refinement of
the binding partition into intervals $I_r$. Notice also that the definition of this escape partition allows
as a special case the possibility that $f^{E(\omega)}(\omega) = \Delta$. In this case of course $\omega$ satisfies exactly the
required properties and we let it belong by definition to the partition $\mathcal{Q}$ and define the return time of $\omega$
as $R(\omega) = E(\omega)$. Otherwise we consider each escape interval $J = f^{E(\omega)}(\omega)$ and use it as a starting
interval for constructing an escape partition and escape time function. Again some of the partition
elements constructed in this way will actually have returns to $\Delta$. These we define to belong to $\mathcal{Q}$ and
let their return time be the sum of the two escape times, i.e. the total number of iterations since they
left $\Delta$, so that $f^{R(\omega)}(\omega) = \Delta$. For those that do not return to $\Delta$ we repeat the procedure. We claim
that almost every point of $\Delta$ eventually belongs to an element which returns to $\Delta$ in a good (Markov)
way at some point and that the tail estimates for the return time function are not significantly affected,
i.e. they are still exponential.

The final calculation to support this claim is based on the following fairly intuitive observation.
Once an interval $\omega$ has an escape, it has reached large scale and therefore it will certainly cover $\Delta$
after some uniformly bounded number of iterations. In particular it contains some subinterval $\tilde\omega \subset \omega$
which has a return to $\Delta$ within at most this uniformly bounded number of iterates after the escape.
Moreover, and crucially, the proportion of $\tilde\omega$ in $\omega$ is uniformly bounded below, i.e. there exists a
constant $\xi > 0$ independent of $\omega$ such that

$$|\tilde\omega| \geq \xi |\omega|. \tag{18}$$

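Using this fact we are now ready to estimate the tail of return times, $|\{\omega \in \mathcal{Q} \mid R(\omega) > n\}|$. The
argument is again based on taking into account some combinatorial information related to the itinerary
of elements of the final partition $\mathcal{Q}$. In particular we shall keep track of the number of escape times
which occur before time $n$ for all elements whose return is greater than $n$. First of all we let

$$\mathcal{Q}^{(n)} = \{\omega \in \mathcal{Q} \mid R(\omega) \geq n\}. \tag{19}$$

Then, for each $1 \leq i \leq n$ we let

$$\mathcal{Q}_i^{(n)} = \{\omega \in \mathcal{Q}^{(n)} \mid E_{i-1}(\omega) \leq n < E_i(\omega)\} \tag{20}$$

be the set of partition elements in $\mathcal{Q}^{(n)}$ which have exactly $i$ escapes. Amongst those we distinguish
those with a specific escape combinatorics. More precisely, for $(t_1, \ldots, t_i)$ such that $t_j \geq 1$ and
$\sum t_j = n$, let

$$\mathcal{Q}_i^{(n)}(t_1, \ldots, t_i) = \Big\{\omega \in \mathcal{Q}_i^{(n)} \ \Big|\ \sum_{j=1}^{k} t_j = E_k(\omega),\ 1 \leq k \leq i - 1\Big\}. \tag{21}$$

We then fix some small $\eta > 0$ to be determined below and write

$$|\{\omega \in \mathcal{Q} \mid R(\omega) > n\}| = \sum_{i \leq n} |\mathcal{Q}_i^{(n)}| = \sum_{i \leq \eta n} |\mathcal{Q}_i^{(n)}| + \sum_{\eta n < i \leq n} |\mathcal{Q}_i^{(n)}|. \tag{22}$$

By (18) we have $|\mathcal{Q}_i^{(n)}| \leq (1 - \xi)^i$, which gives

$$\sum_{\eta n < i \leq n} |\mathcal{Q}_i^{(n)}| \leq \sum_{\eta n < i \leq n} (1 - \xi)^i \lesssim (1 - \xi)^{\eta n} \leq e^{-\gamma_\eta n} \tag{23}$$

for some $\gamma_\eta > 0$. Now let $\tilde\omega \subset \omega \in \mathcal{P}_{E_i}$ be one of the non-returning parts of an interval $\omega$ that had its
$i$'th escape at time $E_i$. Note that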

Therefore 



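$$\sum_{\substack{\omega' \subset \tilde\omega \\ E_{i+1}(\omega') \geq E_i + n}} |\omega'| \leq e^{-\gamma n} |\tilde\omega| \tag{24}$$

where $\gamma$ is as in (13). Let $\mathcal{Q}^{(n)}(E_1, \ldots, E_i)$ denote the set of intervals $\omega \in \mathcal{Q}$ that have precisely $i$ escapes before
time $n$, at times $E_1, \ldots, E_i$; then

$$\sum_{\omega \in \mathcal{Q}^{(n)}(E_1, \ldots, E_i)} |\omega| \leq e^{-\gamma n} |\Delta|. \tag{25}$$

Therefore using again the combinatorial counting argument and the inequality (17) we get

$$\sum_{i \leq \eta n} |\mathcal{Q}_i^{(n)}| \leq \sum_{i \leq \eta n} \sum_{(t_1, \ldots, t_i)} |\mathcal{Q}_i^{(n)}(t_1, \ldots, t_i)| \leq e^{-(\gamma - \hat\eta) n}. \tag{26}$$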
Recall that by (17) $\hat\eta$ can be chosen arbitrarily small by choosing $\eta$ small. Thus, combining (23) and
(26) and substituting into (22) we get

$$|\{\omega \in \mathcal{Q} \mid R(\omega) > n\}| \leq e^{-\gamma_\eta n} + e^{-(\gamma - \hat\eta) n}.$$

The intuitive picture which emerges from the examples discussed above is that of a default exponential
mixing rate for uniformly expanding systems and Hölder continuous observables. However it
is clear that general nonuniformly expanding systems can exhibit a variety of rates of decay. Sometimes
these rates can be linked to properties of specific neutral orbits which can slow down the mixing
process. However it is natural to ask whether there is some intrinsic information related to the very
definition of nonuniform expansivity which determines the rate of decay of correlation. We recall that
$f$ is nonuniformly expanding if there exists $\lambda > 0$ such that for almost every $x \in M$

$$\liminf_{n \to \infty} \frac{1}{n} \sum_{i=0}^{n-1} \log \|Df^{-1}_{f^i(x)}\|^{-1} \geq \lambda. \tag{$*$}$$

Although the constant $\lambda > 0$ is uniform for Lebesgue almost every point, the convergence to the
lim inf is not generally uniform.

A measure of nonuniformity has been proposed in [AlvLuzPin03] based precisely on the
idea of quantifying the rate of convergence. The measure has been shown to be directly linked to the
rate of decay of correlations in [AlvLuzPindim1] in the one-dimensional setting and in [AlvLuzPin]
in arbitrary dimensions, in the case of polynomial rates of decay. Recently the theory has been
extended to cover the exponential case as well [Gou04]. We give here the precise statements.

6.1. Measuring the degree of nonuniformity.

6.1.1. The critical set. Let $f : M \to M$ be a (piecewise) $C^2$ map. For $x \in M$ we let $Df_x$ denote the
derivative of $f$ at $x$ and define $\|Df_x\| = \max\{\|Df_x(v)\| : v \in T_xM, \|v\| = 1\}$. We suppose that $f$
fails to be a local diffeomorphism on some zero measure critical set $\mathcal{C}$ at which $f$ may be discontinuous
and/or $Df$ may be discontinuous and/or singular and/or blow up to infinity. Remarkably, all these
cases can be treated in a unified way as problematic points as will be seen below. In particular we
can define a natural generalization of the non-degeneracy (non-flatness) condition for critical points
of one-dimensional maps.

Definition 12. The critical set $\mathcal{C} \subset M$ is non-degenerate if $m(\mathcal{C}) = 0$ and there is a constant $\beta > 0$
such that for every $x \in M \setminus \mathcal{C}$ we have $\mathrm{dist}(x, \mathcal{C})^{\beta} \lesssim \|Df_x v\| / \|v\| \lesssim \mathrm{dist}(x, \mathcal{C})^{-\beta}$ for all $v \in
T_xM$, and the functions $\log \det Df$ and $\log \|Df^{-1}\|$ are locally Lipschitz with Lipschitz constant
$\lesssim \mathrm{dist}(x, \mathcal{C})^{-\beta}$.

From now on we shall always assume these non-degeneracy conditions. We remark that the results
to be stated below are non-trivial even when the critical set $\mathcal{C}$ is empty and $f$ is a local diffeomorphism
everywhere. For simplicity we suppose also that $f$ is topologically transitive, i.e. there exists a point
$x$ whose orbit is dense in $M$. Without the topological transitivity condition we would just get that the
measure $\mu$ admits a finite number of ergodic components and the results to be given below would then
apply to each of its components.

6.1.2. Expansion and recurrence time functions. Since we have no geometrical information about
$f$ we want to show that the statistical properties such as the rate of decay of correlations somehow
depend on abstract information related to the non-uniform expansivity condition only. Thus we make
the following

Definition 13. For $x \in M$, we define the expansion time function

$$\mathcal{E}(x) = \min\Big\{N : \frac{1}{n} \sum_{i=0}^{n-1} \log \|Df^{-1}_{f^i(x)}\|^{-1} \geq \lambda/2 \ \ \forall\, n \geq N\Big\}.$$

By condition (*) this function is defined and finite almost everywhere. It measures the amount of
time one has to wait before the uniform exponential growth of the derivative kicks in. If $\mathcal{E}(x)$ were
uniformly bounded, we would essentially be in the uniformly expanding case. In general it will take
on arbitrarily large values and not be defined everywhere. If $\mathcal{E}(x)$ is large only on a small set of
points, then it makes sense to think of the map as being not very non-uniform, whereas, if it is large
on a large set of points it is, in some sense, very non-uniform. We remark that the choice of $\lambda/2$ in the
definition of the expansion time function $\mathcal{E}(x)$ is fairly arbitrary and does not affect the asymptotic
rate estimates. Any positive number smaller than $\lambda$ would yield the same results.
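
The expansion time can only be approximated in practice, since its definition involves all sufficiently large times. The following sketch computes a finite-horizon surrogate from a finite list of values $\log \|Df^{-1}_{f^i(x)}\|^{-1}$ (in dimension one these are simply $\log |f'(f^i(x))|$). The function name and the convention of returning `None` when the condition still fails at the end of the horizon are assumptions of this sketch.

```python
def expansion_time(log_derivs, lam):
    """Smallest N >= 1 such that the averages (1/n) * sum(log_derivs[:n]) are
    >= lam/2 for every n with N <= n <= len(log_derivs); None if there is none."""
    partial = 0.0
    last_bad = 0                      # last time at which the average was too small
    for n, value in enumerate(log_derivs, start=1):
        partial += value
        if partial / n < lam / 2.0:
            last_bad = n
    if last_bad == len(log_derivs):
        return None                   # expansion has not kicked in within the horizon
    return last_bad + 1
```

For a uniformly expanding map such as the doubling map, every entry equals $\log 2$ and the surrogate returns $1$; an orbit which spends its first iterates near a critical point produces a larger value.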

We also need to assume some dynamical conditions concerning the rate of recurrence of typical
points near the critical set. We let $d_\delta(x, \mathcal{C})$ denote the $\delta$-truncated distance from $x$ to $\mathcal{C}$ defined as
$d_\delta(x, \mathcal{C}) = d(x, \mathcal{C})$ if $d(x, \mathcal{C}) \leq \delta$ and $d_\delta(x, \mathcal{C}) = 1$ otherwise.

Definition 14. We say that $f$ satisfies the property of subexponential recurrence to the critical set if
for any $\epsilon > 0$ there exists $\delta > 0$ such that for Lebesgue almost every $x \in M$

$$\limsup_{n \to +\infty} \frac{1}{n} \sum_{j=0}^{n-1} -\log \mathrm{dist}_\delta(f^j(x), \mathcal{C}) \leq \epsilon. \tag{$**$}$$

We remark that although condition (**) might appear to be a very technical condition, it is actually
quite natural and in fact almost necessary. Indeed, suppose that an absolutely continuous invariant
measure $\mu$ did exist for $f$. Then, a simple application of Birkhoff's Ergodic Theorem implies that
condition (**) is equivalent to the integrability condition

$$\int_M |\log \mathrm{dist}_\delta(x, \mathcal{C})| \, d\mu < \infty$$

which is simply saying that the invariant measure does not give too much weight to a neighbourhood
of the discontinuity set.

Again, we want to differentiate between different degrees of recurrence in a similar way to the way 
we differentiated between different degrees of non-uniformity of the expansion. 

Definition 15. For $x \in M$, we define the recurrence time function

$$\mathcal{R}(x) = \min\Big\{N \geq 1 : \frac{1}{n} \sum_{i=0}^{n-1} -\log \mathrm{dist}_\delta(f^i(x), \mathcal{C}) \leq 2\epsilon, \ \ \forall\, n \geq N\Big\}.$$

Then, for a map satisfying both conditions (*) and (**) we let

$$\Gamma_n = \{x : \mathcal{E}(x) > n \text{ or } \mathcal{R}(x) > n\}.$$

Notice that $\mathcal{E}(x)$ and $\mathcal{R}(x)$ are finite almost everywhere and thus $|\Gamma_n| \to 0$. It turns out that the rate of
decay of $|\Gamma_n|$ is closely related to the rate of decay of correlations. In the statement of the theorem we
let $\mathcal{C}_n$ denote the correlation function for Hölder continuous observables.
Theorem 10. Let $f : M \to M$ be a transitive $C^2$ local diffeomorphism outside a non-degenerate
critical set $\mathcal{C}$, satisfying conditions (*) and (**). Then

(1) [AlvBonVia00] $f$ admits an acip $\mu$. Some power of $f$ is mixing.

(2) [AlvLuzPin03, AlvLuzPin, AlvLuzPindim1] Suppose that there exists $\gamma > 0$ such that

$$|\Gamma_n| = O(n^{-\gamma}).$$

Then

$$\mathcal{C}_n = O(n^{-\gamma + 1}).$$

(3) [Gou04] Suppose that there exists $\gamma > 0$ such that

$$|\Gamma_n| = O(e^{-\gamma n}).$$

Then there exists $\gamma' > 0$ such that

$$\mathcal{C}_n = O(e^{-\gamma' n}).$$

6.2. Viana maps. A main application of the general results described above is a class of maps
known as Viana or Alves-Viana maps. Viana maps were introduced in [Via97] as an example of a
class of higher dimensional systems which are strictly not uniformly expanding but for which the
non-uniform expansivity condition is satisfied and, most remarkably, is persistent under small
perturbations, which is not the case for any of the examples discussed above. These maps are defined
as skew-products on a two dimensional cylinder of the form $f : S^1 \times \mathbb{R} \to S^1 \times \mathbb{R}$,

$$f(\theta, x) = (k\theta,\ x^2 + a + \varepsilon \sin 2\pi\theta)$$

where $\varepsilon$ is assumed sufficiently small and $a$ is chosen so that, for the one-dimensional quadratic map
$x \mapsto x^2 + a$, the critical point lands after a finite number of iterates onto a hyperbolic
repelling periodic orbit (and thus is a good parameter value and satisfies the non-uniform expansivity
conditions as mentioned above). The map $k\theta$ is taken modulo $2\pi$, and the constant $k$ is a positive
integer which was required to be $\geq 16$ in [Via97] although it was later shown in [BuzSesTsu03] that
any integer $\geq 2$ will work. The sin function in the skew product can also be replaced by more general
Morse functions.

Theorem 11. Viana maps

• [Via97] satisfy (*) and (**). In particular they are nonuniformly expanding;

• [Alv00, AlvVia02] are topologically mixing and have a unique ergodic acip (with respect to
two-dimensional Lebesgue measure);

• [AlvLuzPin] have super-polynomial decay of correlations: for any $\gamma > 0$ we have

$$\mathcal{C}_n = O(n^{-\gamma});$$

• [BalGou03, Gou04] have stretched exponential decay of correlations: there exists $\gamma > 0$ such
that

$$\mathcal{C}_n = O(e^{-\gamma \sqrt{n}}).$$

7. Existence of Nonuniformly Expanding Maps 

An important point which we have not yet discussed is the fact that the verification of the nonuniform
expansivity assumptions is a highly non-trivial problem. For example, the verification that Viana
maps are nonuniformly expanding is one of the main results of [Via97]. Only in some special cases
can the required assumptions be verified directly and easily. The definition of nonuniform expansivity
is in terms of asymptotic properties of the map which are therefore intrinsically not checkable in any
given finite number of steps. The same is true also for the derivative growth assumptions on the critical
orbits of one-dimensional maps as in Theorems 8 and 9. A perfectly legitimate
question is therefore whether these conditions actually do occur for any map at all. Moreover, recent
results suggest that this situation is at best extremely rare in the sense that the set of one-dimensional
maps which have attracting periodic orbits, and in particular do not have an acip, is open and dense in
the space of all one-dimensional maps [GraSwi97, Lyu97, Koz03, She, KozSheStr03]. However, this
topological point of view is only one way of defining "genericity" and it turns out that for general
one-parameter families of one-dimensional maps, the set of parameters for which an acip does exist
can have positive Lebesgue measure (even though it may be topologically nowhere dense).

We give here a fairly complete sketch (!) of the argument in a special case, giving the complete
description of the combinatorial construction and just a brief overview of how the analytic estimates are
obtained. For definiteness and simplicity we focus on the family

$$f_a(x) = x^2 - a$$

for $x \in I = [-2, 2]$ and

$$a \in \Omega_\varepsilon = [2 - \varepsilon, 2]$$

for some $\varepsilon > 0$.

Theorem 12 ([Jak81]). For every $\eta > 0$ there exists an $\varepsilon > 0$ and a set $\Omega^* \subset \Omega_\varepsilon$ such that for all
$a \in \Omega^*$, $f_a$ admits an ergodic absolutely continuous invariant probability measure, and such that

$$|\Omega^*| > (1 - \eta)|\Omega_\varepsilon| > 0.$$

There exist several generalizations of this result for families of smooth maps [Jak81, BenCar85,
Ryc88, ThiTreYou94, MelStr93, Tsu93a, Tsu93] and even to families with completely degenerate (flat)
critical points [Thu99] and to piecewise smooth maps with critical points [LuzVia00, LuzTuc99]. The
arguments in the proofs are all fundamentally of a probabilistic nature and the conclusions depend
on the fact that if $f$ is nonuniformly expanding for a large number of iterates $n$ then it has a "high
probability" of being nonuniformly expanding for $n + 1$ iterates. Thus, by successively deleting those
parameters which fail to be nonuniformly expanding up to some finite number of iterates, one has to delete
smaller and smaller proportions. Therefore a positive proportion survives all exclusions.
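
The idea of successive exclusions can be mimicked, very crudely, on a finite grid of parameters. The sketch below is only a heuristic illustration and not the inductive construction of the proof (which works with parameter intervals and distortion estimates): at each time $n$ it discards the grid parameters whose critical orbit violates finite-time versions of the hyperbolicity and slow recurrence conditions, assuming the conventions $c_0 = f_a(0)$ and $D_n = |(f_a^n)'(c_0)|$. The function name, the grid and the default constants are hypothetical choices.

```python
import math


def surviving_parameters(eps, grid_size, n_max, lam=0.5, alpha=0.05):
    """Grid parameters in [2 - eps, 2] with D_n >= exp(lam*n) and |c_n| >= exp(-alpha*n)
    for all 1 <= n <= n_max, together with the number of survivors after each step."""
    params = [2.0 - eps * i / (grid_size - 1) for i in range(grid_size)]
    # For each parameter keep the current point of the critical orbit and log D_n.
    state = {a: (-a, 0.0) for a in params}
    counts = []
    for n in range(1, n_max + 1):
        kept = {}
        for a, (c, log_d) in state.items():
            if c == 0.0:
                continue                        # derivative vanishes: exclude
            log_d += math.log(abs(2.0 * c))     # log D_n
            c = c * c - a                       # c_n
            if log_d >= lam * n and abs(c) >= math.exp(-alpha * n):
                kept[a] = (c, log_d)
        state = kept                            # exclusions are never undone
        counts.append(len(state))
    return sorted(state), counts


good, counts = surviving_parameters(eps=0.05, grid_size=201, n_max=30)
```

The list `counts` is non-increasing by construction, and the parameter $a = 2$ always survives; how the number of survivors behaves for larger `n_max` depends on the chosen constants and on the grid, so no quantitative conclusion should be drawn from such an experiment.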

In Section 7.1 we give the formal inductive construction of the set $\Omega^*$. In Sections 7.2 and 7.3 we
prove the two main technical lemmas which give expansion estimates for orbits starting respectively
outside and inside some critical neighbourhood. In Section 7.4 we prove the inductive step in the
definition of $\Omega^*$ and in Section 7.5 we obtain the lower bound on the size of $|\Omega^*|$.

The proof involves several constants, some intrinsic to the family under consideration and some
auxiliary for the purposes of the construction. The relationships between these constants and the
order in which they are chosen are quite subtle and also crucial to the argument. However this subtlety
cannot easily be made explicit in such a sketch as we shall give here. We just mention therefore that
there are essentially only two intrinsic constants: $\lambda$, which is the expansivity exponent outside some (in
fact any) critical neighbourhood, and $\varepsilon$, which is the size of the parameter interval $\Omega_\varepsilon$. $\lambda$ can be chosen
first and is essentially arbitrary as long as $\lambda \in (0, \log 2)$; $\varepsilon$ needs to be chosen last to guarantee that
the auxiliary constants can be chosen sufficiently small. The main auxiliary constants are $\lambda_0$, which
can be chosen arbitrarily in $(0, \lambda)$ and which gives the target Lyapunov exponent of the critical orbit
for good parameters, and

$\lambda > \alpha > \delta^{+} = \delta^{\iota} \gg \delta > 0$

which are chosen in the order given and sufficiently small with respect to the previous ones. During
the proof we will introduce also some "second order" auxiliary constants which depend on these.
Finally we shall use the constant $C > 0$ to denote a generic constant whose specific value can vary
between different formulae.

7.1. The definition of $\Omega^*$. We let $c_0 = c_0(a) = f_a(0)$ denote the critical value of $f_a$ and for $i \geq 0$,
$c_i = c_i(a) = f_a^i(c_0)$. For $n \geq 0$ and $\omega \subseteq \Omega$ (here $\Omega$ denotes the parameter interval) let
$\omega_n = \{c_n(a) ; a \in \omega\} \subseteq I$. Notice that for $a = 2$
the critical value maps to a fixed point. Therefore iterates of the critical point for parameter values
sufficiently close to 2 remain in an arbitrarily small neighbourhood of this fixed point for an arbitrarily
long time. In particular it is easy to see that all the inductive assumptions to be formulated below hold
for all $k \leq N$ where $N$ can be taken arbitrarily large if $\varepsilon$ is small enough. This observation will play
an important role in the very last step of the proof.

7.1.1. Inductive assumptions. Let $\Omega^{(0)} = \Omega$ and $\mathcal{P}^{(0)} = \{\Omega^{(0)}\}$ denote the trivial partition of $\Omega$.
Given $n \geq 1$ suppose that for each $k \leq n - 1$ there exists a set $\Omega^{(k)} \subseteq \Omega$ satisfying the following
properties.

Combinatorics: For the moment we describe the combinatorial structure as abstract data; the
geometrical meaning of this data will become clear in the next section. There exists a partition
$\mathcal{P}^{(k)}$ of $\Omega^{(k)}$ into intervals such that each $\omega \in \mathcal{P}^{(k)}$ has an associated itinerary constituted by
the following information. To each $\omega \in \mathcal{P}^{(k)}$ is associated a sequence $0 = \theta_0 < \theta_1 < \cdots <
\theta_r \leq k$, $r = r(\omega) \geq 0$ of escape times. Escape times are divided into three categories, i.e.
substantial, essential, and inessential. Inessential escapes possess no combinatorial feature
and are only relevant to the analytic bounded distortion argument to be developed later. Substantial
and essential escapes play a role in splitting itineraries into segments in the following
sense. Let $0 = \eta_0 < \eta_1 < \cdots < \eta_s \leq k$, $s = s(\omega) \geq 0$ be the maximal sequence of substantial
and essential escape times. Between any two of these times $\eta_{i-1}$ and $\eta_i$ (and between $\eta_s$ and
$k$) there is a sequence $\eta_{i-1} < \nu_1 < \cdots < \nu_t < \eta_i$, $t = t(\omega, i) \geq 0$ of essential return times
(or essential returns) and between any two essential returns $\nu_{j-1}$ and $\nu_j$ (and between $\nu_t$ and
$\eta_i$) there is a sequence $\nu_{j-1} < \mu_1 < \cdots < \mu_u < \nu_j$, $u = u(\omega, i, j) \geq 0$ of inessential return
times (or inessential returns). Following each essential and inessential return (resp. escape) there
is a time interval $[\nu_j + 1, \nu_j + p_j]$ (resp. $[\mu_j + 1, \mu_j + p_j]$) with $p_j \geq 0$ called the binding
period. A binding period cannot contain any return and escape times. Finally, associated to
each essential and inessential return time (resp. escape) is a positive integer $r$ called the return
depth (resp. escape depth).

Bounded Recurrence: We define the function $\mathcal{E}^{(k)} : \Omega^{(k)} \to \mathbb{N}$ which associates to each $a \in \Omega^{(k)}$
the total sum of all essential return depths of the element $\omega \in \mathcal{P}^{(k)}$ containing $a$ in its
itinerary up to and including time $k$. Notice that $\mathcal{E}^{(k)}$ is constant on elements of $\mathcal{P}^{(k)}$ by
construction. Then, for all $a \in \Omega^{(k)}$

$(BR)_k$    $\mathcal{E}^{(k)}(a) \leq \alpha k$

Slow Recurrence: For all $a \in \Omega^{(k)}$ and all $i \leq k$ we have

$(SR)_k$    $|c_i(a)| \geq e^{-\alpha i}$

Notice that $\alpha$ can be chosen arbitrarily small as long as $\varepsilon$ is small in order for this to hold for
all $i \leq N$.
Hyperbolicity: For all $a \in \Omega^{(k)}$

$(EG)_k$    $|(f_a^{k+1})'(c_0)| \geq C e^{\lambda_0 (k+1)}$.

Bounded Distortion: Critical orbits with the same combinatorics satisfy uniformly comparable
derivative estimates: For every $\omega \in \mathcal{P}^{(k)}$, every pair of parameter values $a, b \in \omega$ and every
$j \leq \nu + p + 1$ where $\nu$ is the last return or escape before or equal to time $k$ and $p$ is the
associated binding period, we have

$(BD)_k$    $\frac{|(f_a^j)'(c_0)|}{|(f_b^j)'(c_0)|} \leq \mathcal{D}$  and  $\frac{|c_j'(a)|}{|c_j'(b)|} \leq \mathcal{D}$.

Moreover if $k$ is a substantial escape a similar distortion estimate holds for all $j \leq l$ ($l$ is
the next chopping time) replacing $\mathcal{D}$ by $\tilde{\mathcal{D}}$ and $\omega$ by any subinterval $\omega' \subseteq \omega$ which satisfies
$\omega'_l \subseteq \Delta^+$. In particular for $j \leq k$, the map $c_j : \omega \to \omega_j = \{c_j(a) : a \in \omega\}$ is a bijection.
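
The two analytic conditions $(SR)_k$ and $(EG)_k$ are easy to test numerically for a single parameter. The following is only an illustrative sketch: the function names and the default value of the constant `C` are assumptions made here, and floating point iteration of the critical orbit is of course no substitute for the estimates in the proof.

```python
import math


def critical_orbit(a, n):
    """Return [c_0(a), ..., c_n(a)] for f_a(x) = x**2 - a, with c_0 = f_a(0) = -a."""
    orbit = [-a]
    for _ in range(n):
        orbit.append(orbit[-1] ** 2 - a)
    return orbit


def satisfies_sr_and_eg(a, k, alpha, lambda0, C=1.0):
    """Check (SR)_k and (EG)_k numerically for the parameter a.

    (SR)_k: |c_i(a)| >= exp(-alpha * i) for all i <= k.
    (EG)_k: |(f_a^{k+1})'(c_0)| >= C * exp(lambda0 * (k + 1)).
    The constant C is a hypothetical choice, not a value taken from the text.
    """
    orbit = critical_orbit(a, k)
    slow_recurrence = all(abs(c) >= math.exp(-alpha * i) for i, c in enumerate(orbit))
    if any(c == 0.0 for c in orbit):
        return False  # the derivative along the orbit vanishes
    # log |(f_a^{k+1})'(c_0)| = sum_{i=0}^{k} log |2 c_i|; logs avoid overflow.
    log_derivative = sum(math.log(abs(2.0 * c)) for c in orbit)
    exponential_growth = log_derivative >= math.log(C) + lambda0 * (k + 1)
    return slow_recurrence and exponential_growth
```

For $a = 2$ the critical orbit is $-2, 2, 2, \ldots$, so both conditions hold for every $k$, in line with the remark that the inductive assumptions hold for all $k \leq N$ when $\varepsilon$ is small.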

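7.1.2. Definition of $\Omega^{(n)}$ and $\mathcal{P}^{(n)}$. For $r \in \mathbb{N}$, let $I_r = [e^{-r}, e^{-r+1})$, $I_{-r} = -I_r$ and define

$\Delta^+ = \{0\} \cup \bigcup_{|r| \geq r_{\delta^+} + 1} I_r$  and  $\Delta = \{0\} \cup \bigcup_{|r| \geq r_\delta + 1} I_r$

where $r_\delta = \log \delta^{-1}$, $r_{\delta^+} = \iota \log \delta^{-1}$. We can suppose without loss of generality that $r_\delta, r_{\delta^+} \in \mathbb{N}$.
For technical reasons related to the distortion calculation we also need to subdivide each $I_r$ into
$r^2$ subintervals of equal length. This defines partitions $\mathcal{I}$ of $\Delta$ and $\mathcal{I}^+$ of $\Delta^+$ with $\mathcal{I} = \mathcal{I}^+|_\Delta$. An interval
belonging to either one of these partitions is of the form $I_{r,m}$ with $m \in [1, r^2]$. Let $I^\ell_{r,m}$ and $I^\rho_{r,m}$
denote the elements of $\mathcal{I}^+$ adjacent to $I_{r,m}$ and let $\hat{I}_{r,m} = I^\ell_{r,m} \cup I_{r,m} \cup I^\rho_{r,m}$. If $I_{r,m}$ happens to
be one of the extreme subintervals of $\mathcal{I}^+$ then let $I^\ell_{r,m}$ or $I^\rho_{r,m}$, depending on whether $I_{r,m}$ is a left
or right extreme, denote the intervals $(-\delta^\iota - \delta^\iota/(\log \delta^{-\iota})^2, -\delta^\iota]$ or $[\delta^\iota, \delta^\iota + \delta^\iota/(\log \delta^{-\iota})^2)$ respectively. We
now use this partition to define a refinement $\hat{\mathcal{P}}^{(n)}$ of $\mathcal{P}^{(n-1)}$. Let $\omega \in \mathcal{P}^{(n-1)}$. We distinguish two
different cases.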

Non-chopping times: We say that $n$ is a non-chopping time for $\omega \in \mathcal{P}^{(n-1)}$ if one (or more)
of the following situations occur: (1) $\omega_n \cap \Delta^+ = \emptyset$; (2) $n$ belongs to the binding period
associated to some return or escape time $\nu < n$ of $\omega$; (3) $\omega_n \cap \Delta^+ \neq \emptyset$ but $\omega_n$ does not
intersect more than two elements of the partition $\mathcal{I}^+$. In all three cases we let $\omega \in \hat{\mathcal{P}}^{(n)}$. In
cases (1) and (2) no additional combinatorial information is added to the itinerary of $\omega$. In
case (3), if $\omega_n \cap (\Delta \cup I_{\pm r_\delta}) \neq \emptyset$ (resp. $\omega_n \subset \Delta^+ \setminus (\Delta \cup I_{\pm r_\delta})$), we say that $n$ is an inessential
return time (resp. inessential escape time) for $\omega \in \hat{\mathcal{P}}^{(n)}$. We define the corresponding depth
by $r = \max\{|r| : I_r \cap \omega_n \neq \emptyset\}$.

Chopping times: In all remaining cases, i.e. if $\omega_n \cap \Delta^+ \neq \emptyset$ and $\omega_n$ intersects at least three
elements of $\mathcal{I}^+$, we say that $n$ is a chopping time for $\omega \in \mathcal{P}^{(n-1)}$. We define a natural
subdivision

$\omega = \omega^\ell \cup \bigcup_{(r,m)} \omega^{(r,m)} \cup \omega^\rho$

so that each $\omega_n^{(r,m)}$ fully contains a unique element of $\mathcal{I}^+$ (though possibly extending to
intersect adjacent elements) and $\omega^\ell_n$ and $\omega^\rho_n$ are components of $\omega_n \setminus (\Delta^+ \cap \omega_n)$ with
$|\omega^\ell_n| \geq \delta^\iota/(\log \delta^{-\iota})^2$ and $|\omega^\rho_n| \geq \delta^\iota/(\log \delta^{-\iota})^2$. If the connected components of
$\omega_n \setminus (\Delta^+ \cap \omega_n)$ fail
to satisfy the above condition on their length we just glue them to the adjacent interval of the
form $\omega_n^{(r,m)}$. By definition we let each of the resulting subintervals of $\omega$ be elements of $\hat{\mathcal{P}}^{(n)}$.
The intervals $\omega^\ell$, $\omega^\rho$ and $\omega^{(r,m)}$ with $|r| < r_\delta$ are called escape components and are said to
have a substantial escape and essential escape respectively at time $n$. The corresponding
values of $|r| < r_\delta$ are the associated essential escape depths. All other intervals are said to
have an essential return at time $n$ and the corresponding values of $|r|$ are the associated
essential return depths. We remark that partition elements $I_{\pm r_\delta}$ do not belong to $\Delta$ but we still
say that the associated intervals $\omega^{(\pm r_\delta, m)}$ have a return rather than an escape.
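
The partition of the critical neighbourhood into the intervals $I_{r,m}$ can be made concrete with a few lines of code. This is a minimal sketch under stated assumptions: the function name `locate` and the convention that the index $m$ is counted from the endpoint of $I_r$ closest to the critical point are choices made here for illustration, since the text does not fix a numbering.

```python
import math


def locate(x):
    """Return (r, m) such that x lies in the subinterval I_{r,m}.

    I_r = [e^{-r}, e^{-r+1}) for r >= 1, I_{-r} = -I_r, and each I_r is cut
    into r**2 subintervals of equal length, numbered m = 1, ..., r**2 starting
    from the endpoint closest to 0 (an assumption made for this sketch).
    """
    if x == 0.0 or abs(x) >= 1.0:
        raise ValueError("x must satisfy 0 < |x| < 1")
    r = math.ceil(-math.log(abs(x)))   # |x| lies in [e^{-r}, e^{-r+1})
    left = math.exp(-r)
    length = math.exp(-r + 1) - left
    m = int((abs(x) - left) / length * r * r) + 1
    m = min(m, r * r)                  # guard against rounding at the outer endpoint
    return (r if x > 0 else -r, m)
```

With this indexing, a point $x$ belongs to $\Delta$ when the returned $|r|$ is at least $r_\delta + 1$, and to $\Delta^+$ when it is at least $r_{\delta^+} + 1$; the value $|r|$ is the depth used in the definitions above.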

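This completes the definition of the partition $\hat{\mathcal{P}}^{(n)}$ of $\Omega^{(n-1)}$ and of the function $\mathcal{E}^{(n)}$ on $\Omega^{(n-1)}$.
We define

(27)    $\Omega^{(n)} = \{a \in \Omega^{(n-1)} : \mathcal{E}^{(n)}(a) \leq \alpha n\}$

Notice that $\mathcal{E}^{(n)}$ is constant on elements of $\hat{\mathcal{P}}^{(n)}$. Thus $\Omega^{(n)}$ is the union of elements of $\hat{\mathcal{P}}^{(n)}$ and we
can define

$\mathcal{P}^{(n)} = \hat{\mathcal{P}}^{(n)}|_{\Omega^{(n)}}$.

Notice that the combinatorics and the recurrence condition $(BR)_n$ are satisfied for every $a \in \Omega^{(n)}$ by
construction. In Section 7.4 we shall prove that conditions $(EG)_n$, $(SR)_n$, $(BD)_n$ all hold for $\Omega^{(n)}$.
Then we define

$\Omega^* = \bigcap_{n \geq 0} \Omega^{(n)}$.

In particular, for every $a \in \Omega^*$, the map $f_a$ has an exponentially growing derivative along the critical
orbit and thus, by Lemma 8, admits an ergodic acip. In Section 7.5 we prove that
$|\Omega^*| > 0$.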

We recall that a sketch of the proof of the existence of an induced Markov map under precisely the
hyperbolicity and slow recurrence assumptions given here is carried out in section 5.3. As mentioned
there, the strategy for construction of the induced Markov map is remarkably similar to the strategy
for the construction carried out here for estimating the probability that such conditions hold. The
deeper meaning of this similarity is not clear.

7.2. Expansion outside the critical neighbourhood. On some deep level, the statement in the
Theorem depends essentially on the following result which we have already used in section 5.3.

Lemma 7.1. There exists a constant $C > 0$ independent of $\delta$ such that for $\varepsilon > 0$ sufficiently small,
all $a \in \Omega_\varepsilon$, $f = f_a$, $x \in I$ and $n \geq 1$ such that $x, f(x), \ldots, f^{n-1}(x) \notin \Delta$ we have

$|Df^n(x)| \geq \delta e^{\lambda n}$

and if, moreover, $f^n(x) \in \Delta^+$ and/or $x \in f(\Delta^+)$ then

$|Df^n(x)| \geq C e^{\lambda n}$.

In the proof of the theorem we will use some other features of the quadratic family and of the
specific parameter interval $\Omega_\varepsilon$ but it is arguable that they are inessential and that the properties stated in
Lemma 7.1 are to a certain extent sufficient conditions for the argument. It would be very interesting to try to
prove the main theorem using only the properties stated in Lemma 7.1. On a general "philosophical"
level, the idea, as I believe originally articulated explicitly by L.-S. Young, is that

Uniform Hyperbolicity for all parameters in most of the state space 

implies 

Nonuniform Hyperbolicity in all the state space for most parameters. 






7.3. The binding period. Next we make precise the definition of the binding period which is part
of the combinatorial information given above. Let $k \leq n - 1$, $\omega \in \mathcal{P}^{(k)}$ and suppose that $k$ is an
essential or inessential return or escape time for $\omega$ with return depth $r$. Then we define the binding
period of $\omega_k$ as

$p(\omega_k) = \min_{a \in \omega} \{p(c_k(a))\}$

where

(28)    $p(c_k(a)) = \min\{i : |c_{k+1+i}(a) - c_i(a)| \geq e^{-2\alpha i}\}$.

This is the time for which the future orbit of $c_k(a)$ can be thought of as shadowing or being bound
to the orbit of the critical point (that is, in some sense, the number of iterations for which the orbit
of $c(a)$ repeats its early history after the $k$'th iterate). We will obtain some estimates concerning the
length of this binding period and the overall derivative growth during this time.
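
Definition (28) translates directly into a short computation for a single parameter value. The sketch below is an illustration only: the function names and the safety cap `max_iter` are assumptions introduced here, and the minimum over $a \in \omega$ in the definition of $p(\omega_k)$ is not computed.

```python
import math


def critical_orbit(a, n):
    """Return [c_0(a), ..., c_n(a)] for f_a(x) = x**2 - a, with c_0 = f_a(0) = -a."""
    orbit = [-a]
    for _ in range(n):
        orbit.append(orbit[-1] ** 2 - a)
    return orbit


def binding_period(a, k, alpha, max_iter=10_000):
    """Smallest i with |c_{k+1+i}(a) - c_i(a)| >= exp(-2 * alpha * i), as in (28).

    max_iter is a safety cap added for this sketch; it is not part of the definition.
    """
    orbit = critical_orbit(a, k + 1 + max_iter)
    for i in range(max_iter + 1):
        if abs(orbit[k + 1 + i] - orbit[i]) >= math.exp(-2.0 * alpha * i):
            return i
    raise RuntimeError("no binding period found within max_iter iterates")
```

The binding period is only meaningful when $c_k(a)$ is close to the critical point: in that case $c_{k+1}(a)$ starts close to $c_0(a)$ and the two orbits stay close for a number of iterates. When $c_k(a)$ is far from $0$ the function simply returns $0$.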

Lemma 7.2. There exist constants $\tau_0 > 0$ and $\gamma_1 \in (0, 1)$ such that the following holds. Let $k \leq n - 1$,
$\omega \in \mathcal{P}^{(k)}$ and suppose that $k$ is an essential or inessential return or escape time for $\omega$ with return
depth $r$. Let $p = p(\omega_k)$. Then for every $a \in \omega$ we have

(29)    $p \leq \tau_0 \log |c_k(a)|^{-1} < k$
and

(30)    $|Df^{p+1}(c_k(a))| \geq C e^{r(1 - \gamma_1)} \geq C e^{\frac{1 - \gamma_1}{\tau_0}(p+1)}$
and, if $k$ is an essential return or an essential escape, then

(31)    $|\omega_{k+p+1}| \geq C e^{-\gamma_1 r}$.

To simplify the notation we write $x = c_k(a)$ and $x_0 = c_{k+1}(a)$ and omit the dependence on the
parameter $a$ where there is no risk of confusion. The first step in the proof is to obtain a bounded
distortion estimate during binding periods: there exists a constant $\mathcal{D}_1(\alpha_0, \alpha_1)$ independent of $x$, such
that for all $a \in \omega$, all $y_0, z_0 \in [x_0, c_0]$ and all $0 \leq j \leq \min\{p - 1, k\}$ we have

$\frac{|(f^j)'(z_0)|}{|(f^j)'(y_0)|} \leq \mathcal{D}_1$.

This follows from the standard distortion calculations carried out earlier, using the upper bound
$e^{-2\alpha i}$ from the definition of binding in the numerator and the lower bound $e^{-\alpha i}$ from the slow
recurrence condition $(SR)_k$ in the denominator. Notice that for this reason the distortion bound is
formally calculated for iterates $j \leq \min\{p - 1, k\}$ (the slow recurrence condition cannot be
guaranteed for iterates larger than $k$). The next step however gives an estimate for the duration of the
binding period and implies that $p < k$ and therefore the distortion estimates do indeed hold throughout
the duration of the binding period. The basic idea for the upper bound on $p$ is simple. The length of
the interval $[x_0, c_0]$ is determined by the length of the interval $[x, 0]$, which is $|c_k(a)|$. The exponential
growth of the derivative along the critical orbit and the bounded distortion imply that this interval is
growing exponentially fast. The condition which determines the end of the binding period is shrinking
exponentially fast. Some standard mean value theorem estimates using these two facts give the result.
Finally the average derivative growth during the binding is given by the combined effect of the small
derivative of order $|c_k(a)|$ at the return to the critical neighbourhood and the exponential growth during
the binding period. The result then intuitively boils down to showing that the binding period is long
enough to (over) compensate the small derivative at the return.

The final statement in the lemma requires some control over the way that the derivatives with
respect to the parameter are related to the standard derivatives with respect to a point. This is a fairly
important point which will be used again and therefore we give a more formal statement.

Lemma 7.3. There exists a constant $\mathcal{D}_2 > 0$ such that for any $1 \leq k \leq n - 1$, $\omega \in \mathcal{P}^{(k)}$ and $a \in \omega$
we have

$\mathcal{D}_2 \geq \frac{|c_k'(a)|}{|Df_a^k(c_0)|} \geq \frac{1}{\mathcal{D}_2}$

and, for all $1 \leq i < j \leq k + 1$, there exists $\tilde{a} \in \omega$ such that

(32)    $\frac{1}{\mathcal{D}_2} |Df_{\tilde{a}}^{j-i}(c_i(\tilde{a}))| \leq \frac{|\omega_j|}{|\omega_i|} \leq \mathcal{D}_2 |Df_{\tilde{a}}^{j-i}(c_i(\tilde{a}))|$

Proof. The second statement is a sort of parameter mean value theorem and follows immediately
from the first one and the standard mean value theorem. To prove the first one let $F : \Omega \times I \to I$ be
the function of two variables defined inductively by $F(a, x) = f_a(x)$ and $F^k(a, x) = F(a, f_a^{k-1} x)$.
Then, for $x = c_0$, we have

$c_k'(a) = \partial_a F^k(a, c_0) = \partial_a F(a, f_a^{k-1} c_0) = -1 + f_a'(c_{k-1}) c_{k-1}'(a)$.

Iterating this expression gives

$-c_k'(a) = 1 + f_a'(c_{k-1}) + f_a'(c_{k-1}) f_a'(c_{k-2}) + \cdots + f_a'(c_{k-1}) f_a'(c_{k-2}) \cdots f_a'(c_1) f_a'(c_0)$

and dividing both sides by $(f_a^k)'(c_0) = f_a'(c_{k-1}) f_a'(c_{k-2}) \cdots f_a'(c_1) f_a'(c_0)$ gives

(33)    $-\frac{c_k'(a)}{(f_a^k)'(c_0)} = 1 + \sum_{i=1}^{k} \frac{1}{(f_a^i)'(c_0)}$.

The result then depends on making sure that the sum on the right hand side is bounded away from
$-1$. Since the critical point spends an arbitrarily large number $N$ of iterates in an arbitrarily small
neighbourhood of a fixed point at which the derivative has absolute value 4 (with $f_a'(c_0)$ close to $-4$),
we can bound an arbitrarily long initial part of this sum below by $-1/2$. By the exponential growth
condition the tail of the sum is still geometric
and by taking $N$ large enough we can make sure that this tail is less than $1/2$ in absolute value. □
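
Identity (33) can be checked numerically by computing the parameter derivative $c_k'(a)$ from the recursion $c_j'(a) = -1 + f_a'(c_{j-1}) c_{j-1}'(a)$ with $c_0'(a) = -1$ and comparing with the sum of reciprocals of the space derivatives. The following is a minimal sketch; the function name is an assumption made here.

```python
def parameter_vs_space_derivative(a, k):
    """Return both sides of (33) for f_a(x) = x**2 - a.

    lhs = -c_k'(a) / (f_a^k)'(c_0), with c_k'(a) from the recursion
          c_0'(a) = -1,  c_j'(a) = -1 + f_a'(c_{j-1}) * c_{j-1}'(a);
    rhs = 1 + sum_{i=1}^{k} 1 / (f_a^i)'(c_0).
    """
    c = -a                    # c_0(a) = f_a(0)
    dc = -1.0                 # c_0'(a)
    space_derivative = 1.0    # (f_a^0)'(c_0)
    rhs = 1.0
    for _ in range(k):
        slope = 2.0 * c                # f_a'(c_{j-1})
        dc = -1.0 + slope * dc         # c_j'(a)
        space_derivative *= slope      # (f_a^j)'(c_0)
        rhs += 1.0 / space_derivative
        c = c * c - a                  # c_j(a)
    lhs = -dc / space_derivative
    return lhs, rhs
```

For $a = 2$ one has $(f_a^i)'(c_0) = -4^i$, so the right hand side of (33) equals $1 - \sum_{i=1}^{k} 4^{-i}$, which decreases to $2/3$: the sum of reciprocals stays near $-1/3$ and hence well away from $-1$.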

Returning to the proof of Lemma 7.2 we can use the parameter/space derivative bound to extend
the derivative expansion result to the entire interval $\omega_k$ and therefore to estimate the growth of this
interval during the binding period.

7.4. Positive exponents in dynamical space. Using a combination of the expansivity estimates
outside $\Delta$ and the binding period estimates for returns to $\Delta$ it is possible to prove the inductive step stated
above.

The slow recurrence condition is essentially an immediate consequence of the parameter exclusion 
condition. 

The exponential growth condition relies on the following crucial and non-trivial observation: the
overall proportion of bound iterates is small. This follows from the parameter exclusion condition
which bounds the total sum of return depths (an estimate is required to show that inessential returns do
not contribute significantly to the total) and the binding period estimates which show that the length of
the binding period is bounded by a fraction of the return depth. This implies that the overall derivative
growth is essentially built up from the free iterates outside $\Delta$ and this gives an overall derivative
growth at an exponential rate independent of $n$.

The bounded distortion estimate again starts with the basic distortion estimate used earlier. By
Lemma 7.3 it is sufficient to prove the estimate for the space derivatives $Df_a^k$; intuitively this is saying
that critical orbits with the same combinatorics satisfy the same derivative estimates. The difficulty
here is that although the images of parameter intervals $\omega$ are growing exponentially, they do not satisfy
a uniform backward exponential bound as required to carry out the standard argument; also
images of $\omega$ can come arbitrarily close to the critical point and thus the denominator does not admit
any uniform bounds. The calculation therefore is technically quite involved and we refer the reader
to published proofs such as [Luz00] for the details. Here we just mention that the argument involves
decomposing the sum into "pieces" corresponding to free and bound iterates and estimating each one
independently, and taking advantage of the subdivision of the critical neighbourhood into intervals
$I_r$, each of which is crucially further subdivided into $r^2$ subintervals of equal length. This implies
that the contribution to the distortion of each return is at most of the order of $1/r^2$ instead of order 1
and allows us to obtain the desired conclusion using the fact that $1/r^2$ is summable in $r$.

7.5. Positive measure in parameter space. Recall that $\hat{\mathcal{P}}^{(n)}$ is the partition of $\Omega^{(n-1)}$ which takes
into account the dynamics at time $n$ and which restricts to the partition $\mathcal{P}^{(n)}$ of $\Omega^{(n)}$ after the exclusion
of certain elements of $\hat{\mathcal{P}}^{(n)}$. Our aim here is to develop some combinatorial and metric estimates
which will allow us to estimate the measure of parameters to be excluded at time $n$.

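The first step is to take a fresh look at the combinatorial structure and "re-formulate it" in a way
which is more appropriate. To each $\omega \in \hat{\mathcal{P}}^{(n)}$ is associated a sequence $0 = \eta_0 < \eta_1 < \cdots <
\eta_s \leq n$, $s = s(\omega) \geq 0$ of escape times and a corresponding sequence of escaping components
$\omega \subseteq \omega^{(\eta_s)} \subseteq \cdots \subseteq \omega^{(\eta_0)}$ with $\omega^{(\eta_i)} \subseteq \Omega^{(\eta_i)}$ and $\omega^{(\eta_i)} \in \mathcal{P}^{(\eta_i)}$. To simplify the formalism we
also define some "fake" escapes by letting $\omega^{(\eta_i)} = \omega$ for all $s + 1 \leq i \leq n$. In this way we have
a well defined parameter interval $\omega^{(\eta_i)}$ associated to $\omega \in \hat{\mathcal{P}}^{(n)}$ for each $0 \leq i \leq n$. Notice that for
two intervals $\omega, \tilde{\omega} \in \hat{\mathcal{P}}^{(n)}$ and any $0 \leq i \leq n$, the corresponding intervals $\omega^{(\eta_i)}$ and $\tilde{\omega}^{(\eta_i)}$ are either
disjoint or coincide. Then we define

$Q^{(i)} = \bigcup_{\omega \in \hat{\mathcal{P}}^{(n)}} \omega^{(\eta_i)}$

and let $\mathcal{Q}^{(i)} = \{\omega^{(\eta_i)}\}$ denote the natural partition of $Q^{(i)}$ into intervals of the form $\omega^{(\eta_i)}$. Notice
that $\Omega^{(n-1)} = Q^{(n)} \subseteq \cdots \subseteq Q^{(0)} = \Omega^{(0)}$ and $\mathcal{Q}^{(n)} = \hat{\mathcal{P}}^{(n)}$ since the number $s$ of escape times
is always strictly less than $n$ and therefore in particular $\omega^{(\eta_n)} = \omega$ for all $\omega \in \hat{\mathcal{P}}^{(n)}$. For a given
$\omega = \omega^{(\eta_i)} \in \mathcal{Q}^{(i)}$, $0 \leq i \leq n - 1$ we let

$Q^{(i+1)}(\omega) = \{\omega' = \omega^{(\eta_{i+1})} \in \mathcal{Q}^{(i+1)} : \omega' \subseteq \omega\}$

denote all the elements of $\mathcal{Q}^{(i+1)}$ which are contained in $\omega$ and let $\mathcal{Q}^{(i+1)}(\omega)$ denote the corresponding
partition. Then we define a function $\Delta\mathcal{E}^{(i)} : Q^{(i+1)}(\omega) \to \mathbb{N}$ by

$\Delta\mathcal{E}^{(i)}(a) = \mathcal{E}^{(\eta_{i+1})}(a) - \mathcal{E}^{(\eta_i)}(a)$.

This gives the total sum of all essential return depths associated to the itinerary of the element
$\omega' \in \mathcal{Q}^{(i+1)}(\omega)$ containing $a$, between the escape at time $\eta_i$ and the escape at time $\eta_{i+1}$. Clearly $\Delta\mathcal{E}^{(i)}(a)$
is constant on elements of $\mathcal{Q}^{(i+1)}(\omega)$. Finally we let

$\mathcal{Q}^{(i+1)}(\omega, R) = \{\omega' \in \mathcal{Q}^{(i+1)} : \omega' \subseteq \omega, \Delta\mathcal{E}^{(i)}(\omega') = R\}$.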

Notice that the entire construction given here depends on $n$. The main motivation for this construction
is the following

Lemma 7.4. There exists a constant $\gamma_0 \in (0, 1 - \gamma_1)$ such that the following holds. For all $i \leq n - 1$,
$\omega \in \mathcal{Q}^{(i)}$ and $R \geq 0$ we have

(34)    $\sum_{\omega' \in \mathcal{Q}^{(i+1)}(\omega, R)} |\omega'| \leq e^{(\gamma_0 + \gamma_1 - 1) R} |\omega|$.

This says essentially that the probability of accumulating a large total return depth between one
escape and the next is exponentially small. The strategy for proving this result is straightforward. We
show first of all that for $0 \leq i \leq n - 1$, $\omega \in \mathcal{Q}^{(i)}$, $R \geq 0$ and $\omega' \in \mathcal{Q}^{(i+1)}(\omega, R)$ we have

(35)    $|\omega'| \leq e^{(\gamma_1 - 1) R} |\omega|$.

The proof is not completely straightforward but depends on the intuitively obvious fact that an interval
which has a deep return must necessarily be very small (since it is only allowed to contain at most
three adjacent partition elements at the return). Notice moreover that this statement on its own is
not sufficient to imply (34) as there could be many small intervals with a large return depth which
together add up to a large total measure. However we can control to some extent the multiplicity of these
intervals and show that we can choose an arbitrarily small $\gamma_0$ (by choosing the critical neighbourhood
$\Delta$ sufficiently small) so that for all $0 \leq i \leq n - 1$, $\omega \in \mathcal{Q}^{(i)}$ and $R \geq r_\delta$, we have

(36)    $\#\mathcal{Q}^{(i+1)}(\omega, R) \leq e^{\gamma_0 R}$.

This depends on the observation that each $\omega$ has an essentially unique (uniformly bounded
multiplicity) sequence of return depths. Thus the estimate can be approached via purely combinatorial
arguments very similar to those used earlier in these notes. Choosing $\delta$ small means the
sequences of return depths have terms bounded below by $r_\delta$ which can be chosen large, and this allows
the exponential rate of increase of the combinatorially distinct sequences with $R$ to be taken small.
Combining (35) and (36) immediately gives (34).
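
Spelled out, the last step is a one-line count. For $R \geq r_\delta$ the sum in (34) has at most $\#\mathcal{Q}^{(i+1)}(\omega, R)$ terms, each bounded by (35), so that (in the notation of Lemma 7.4):

```latex
\sum_{\omega' \in \mathcal{Q}^{(i+1)}(\omega, R)} |\omega'|
  \;\le\; \#\mathcal{Q}^{(i+1)}(\omega, R) \cdot
          \max_{\omega' \in \mathcal{Q}^{(i+1)}(\omega, R)} |\omega'|
  \;\le\; e^{\gamma_0 R} \, e^{(\gamma_1 - 1) R} \, |\omega|
  \;=\; e^{(\gamma_0 + \gamma_1 - 1) R} \, |\omega| .
```

Since $\gamma_0 < 1 - \gamma_1$ the exponent is negative, which is what makes the bound exponentially small in $R$. For $R = 0$ the inequality (34) holds trivially because the intervals $\omega'$ are disjoint subintervals of $\omega$.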
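Now choose some $\gamma_2 \in (0, 1 - \gamma_0 - \gamma_1)$ and let

$\gamma = \gamma_0 + \gamma_1 + \gamma_2 > 0$.

For $0 \leq i \leq n - 1$, and $\omega \in \mathcal{Q}^{(i)}$, write

$\sum_{\omega' \in \mathcal{Q}^{(i+1)}(\omega)} e^{\gamma_2 \Delta\mathcal{E}^{(i)}(\omega')} |\omega'| = \sum_{\omega' \in \mathcal{Q}^{(i+1)}(\omega, 0)} |\omega'| + \sum_{R \geq r_\delta} e^{\gamma_2 R} \sum_{\omega' \in \mathcal{Q}^{(i+1)}(\omega, R)} |\omega'|$.

By (34) we then have

(37)    $\sum_{\omega' \in \mathcal{Q}^{(i+1)}(\omega, 0)} |\omega'| + \sum_{R \geq r_\delta} e^{\gamma_2 R} \sum_{\omega' \in \mathcal{Q}^{(i+1)}(\omega, R)} |\omega'| \leq \Big(1 + \sum_{R \geq r_\delta} e^{(\gamma_0 + \gamma_1 + \gamma_2 - 1) R}\Big) |\omega|$.

Since $\mathcal{E}^{(n)} = \Delta\mathcal{E}^{(0)} + \cdots + \Delta\mathcal{E}^{(n-1)}$ and $\Delta\mathcal{E}^{(i)}$ is constant on elements of $\mathcal{Q}^{(i+1)}$ we can write
(with $\omega^{(i)}$ as shorthand for $\omega^{(\eta_i)}$ and $\omega^{(0)} = \Omega^{(0)}$)

$\sum_{\omega \in \mathcal{Q}^{(n)}} e^{\gamma_2 \mathcal{E}^{(n)}(\omega)} |\omega| = \sum_{\omega^{(1)} \in \mathcal{Q}^{(1)}(\omega^{(0)})} e^{\gamma_2 \Delta\mathcal{E}^{(0)}(\omega^{(1)})} \sum_{\omega^{(2)} \in \mathcal{Q}^{(2)}(\omega^{(1)})} e^{\gamma_2 \Delta\mathcal{E}^{(1)}(\omega^{(2)})} \cdots \sum_{\omega = \omega^{(n)} \in \mathcal{Q}^{(n)}(\omega^{(n-1)})} e^{\gamma_2 \Delta\mathcal{E}^{(n-1)}(\omega^{(n)})} |\omega|$.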

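Notice the nested nature of the expression. Applying (37) repeatedly gives

(38)    $\int_{\Omega^{(n-1)}} e^{\gamma_2 \mathcal{E}^{(n)}} = \sum_{\omega \in \hat{\mathcal{P}}^{(n)}} e^{\gamma_2 \mathcal{E}^{(n)}(\omega)} |\omega| \leq \Big(1 + \sum_{R \geq r_\delta} e^{(\gamma - 1) R}\Big)^n |\Omega|$.

The definition of $\Omega^{(n)}$ gives

$|\Omega^{(n-1)}| - |\Omega^{(n)}| = |\Omega^{(n-1)} \setminus \Omega^{(n)}| = |\{\omega \in \hat{\mathcal{P}}^{(n)} : e^{\gamma_2 \mathcal{E}^{(n)}(\omega)} \geq e^{\gamma_2 \alpha n}\}|$

and therefore using Chebyshev's inequality and (38) we have

$|\Omega^{(n-1)}| - |\Omega^{(n)}| \leq e^{-\gamma_2 \alpha n} \int_{\Omega^{(n-1)}} e^{\gamma_2 \mathcal{E}^{(n)}} \leq \Big[e^{-\gamma_2 \alpha} \Big(1 + \sum_{R \geq r_\delta} e^{(\gamma - 1) R}\Big)\Big]^n |\Omega|$

which implies

$|\Omega^{(n)}| \geq |\Omega^{(n-1)}| - \Big[e^{-\gamma_2 \alpha} \Big(1 + \sum_{R \geq r_\delta} e^{(\gamma - 1) R}\Big)\Big]^n |\Omega|$

and thus

$|\Omega^*| \geq \Big(1 - \sum_{j=N}^{\infty} \Big[e^{-\gamma_2 \alpha} \Big(1 + \sum_{R \geq r_\delta} e^{(\gamma - 1) R}\Big)\Big]^j\Big) |\Omega|$.

Choosing $N$ sufficiently large, by taking $\varepsilon$ sufficiently small, guarantees that the right hand side is
positive.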
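
To see how the constants interact in this final bound, one can sum the two geometric series in closed form. The sketch below is purely illustrative: the function name is an assumption, and the numerical values used in the usage note are hypothetical choices, not constants derived in the proof.

```python
import math


def surviving_proportion_lower_bound(gamma, gamma2, alpha, r_delta, N):
    """Return (bound, rho) with bound = 1 - sum_{j >= N} rho**j.

    rho = exp(-gamma2 * alpha) * (1 + sum_{R >= r_delta} exp((gamma - 1) * R)),
    with both geometric series summed in closed form. Requires 0 < gamma < 1.
    """
    if not 0.0 < gamma < 1.0:
        raise ValueError("gamma must lie in (0, 1)")
    q = math.exp(gamma - 1.0)
    tail = q ** r_delta / (1.0 - q)      # sum_{R >= r_delta} e^{(gamma - 1) R}
    rho = math.exp(-gamma2 * alpha) * (1.0 + tail)
    if rho >= 1.0:
        raise ValueError("rho >= 1: the excluded proportions are not summable")
    return 1.0 - rho ** N / (1.0 - rho), rho
```

For instance, with the hypothetical values $\gamma = 0.5$, $\gamma_2 = 0.1$, $\alpha = 0.01$, $r_\delta = 40$ the ratio `rho` is just below 1, and `N = 10000` already gives a lower bound of roughly 95% of the parameter interval. The roles of the constants are visible in the formula: $r_\delta$ large (that is, $\delta$ small) makes the inner sum negligible so that `rho < 1`, and $N$ large (that is, $\varepsilon$ small) makes the outer tail small.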

8. Conclusion 

In this final section we make some concluding remarks and present some questions and open prob- 
lems. 

8.1. What causes slow decay of correlations? The general theory described in section 6 is based
on a certain way of quantifying the intrinsic nonuniformity of $f$ which does not rely on identifying
particular critical and/or neutral orbits. However, the conceptual picture according to which slow
rates of decay are caused by a slowing down process due to the presence of neutral orbits can also be
generalized. Indeed, the abstract formulation of the concept of a neutral orbit is naturally that of an
orbit with a zero Lyapunov exponent. The definition of nonuniform expansivity implies that almost
all orbits have uniformly positive Lyapunov exponents but this does not exclude the possibility of
some other point having a zero Lyapunov exponent. It seems reasonable to imagine that a point with
a zero Lyapunov exponent could slow down the overall mixing process in a way which is completely
analogous to the specific examples mentioned above. Therefore we present here, in a heuristic form,
a natural conjecture.

Conjecture 1. Suppose $f$ is non-uniformly expanding. Then $f$ has exponential decay of correlations
if and only if it has no orbits with zero Lyapunov exponent.

An attempt to state this conjecture in a precise way reveals several subtle points which need to
be considered. We discuss some of these briefly. Let $\mathcal{M}$ denote the space of all $f$-invariant
probability measures $\mu$ on $M$ which satisfy the integrability condition $\int \log \|Df_x\| \, d\mu < \infty$. Then by
standard theory, see also [BarPes04, Section 5.8], we can apply a version of Oseledets' Theorem for
non-invertible maps which says that there exist constants $\lambda_1, \ldots, \lambda_k$ with $k \leq d$, and a measurable
decomposition $T_x M = E^1_x \oplus \cdots \oplus E^k_x$ of the tangent bundle over $M$ such that the decomposition is
invariant by the derivative and such that for all $j = 1, \ldots, k$ and for all non zero vectors $v^{(j)} \in E^j_x$ we
have

$\lim_{n \to \infty} \frac{1}{n} \sum_{i=0}^{n-1} \log \|Df_{f^i(x)}(v_i^{(j)})\| = \lambda_j$,

where $v_i^{(j)}$ denotes the unit vector in the direction of $Df^i_x(v^{(j)})$.

The constants $\lambda_1, \ldots, \lambda_k$ are called the Lyapunov exponents associated to the measure $\mu$. The
definition of nonuniform expansivity implies that all Lyapunov exponents associated to the acip $\mu$ are $\geq \lambda$
and thus uniformly positive, but it certainly does not exclude the possibility that there exists some other
(singular with respect to Lebesgue) invariant probability measure with some zero Lyapunov exponent.
This is the case for example for the maps of Section 4 for which the Dirac measure on the indifferent
fixed point has a zero Lyapunov exponent.
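
In dimension one the Lyapunov exponent of the Dirac measure on a fixed point is just the logarithm of the absolute value of the derivative there, and the finite time averages can be computed directly. The sketch below is only an illustration: the map `neutral` is a hypothetical example with an indifferent fixed point at $0$, chosen here for convenience and not taken from Section 4.

```python
import math


def lyapunov_exponent(f, df, x, n):
    """Finite-time exponent (1/n) * sum_{i<n} log |f'(f^i(x))| of a one-dimensional map."""
    total = 0.0
    for _ in range(n):
        total += math.log(abs(df(x)))
        x = f(x)
    return total / n


# Hypothetical map with an indifferent fixed point at 0 (f(0) = 0, f'(0) = 1).
def neutral(x):
    return (x + x * x) % 1.0


def d_neutral(x):
    return 1.0 + 2.0 * x


# Quadratic map f_2(x) = x^2 - 2 with repelling fixed point 2 (f_2'(2) = 4).
def quadratic(x):
    return x * x - 2.0


def d_quadratic(x):
    return 2.0 * x
```

Started at the fixed point $0$ of `neutral` the average is exactly zero for every $n$, while at the fixed point $2$ of `quadratic` it equals $\log 4$: the first is a measure with zero Lyapunov exponent of the kind discussed above, the second is not.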

Thus one way to state precisely the above conjecture is to claim that $f$ has exponential decay of
correlations if and only if all Lyapunov exponents associated to all invariant probability measures in $\mathcal{M}$ are
uniformly positive. Of course, a priori, there may also be some exceptional points, not typical for any
measure in $\mathcal{M}$, along whose orbit the derivative expands subexponentially and which therefore might
similarly have a slowing down effect. Also it may be that one zero Lyapunov exponent along one
specific direction may not have a significant effect whereas a measure for which all Lyapunov exponents
were zero would. Positive results in the direction of this conjecture include the remarkable
observation that local diffeomorphisms for which all Lyapunov exponents for all measures are positive must
actually be uniformly expanding [AlvAraSau03, Cao03, CaoLuzRio] and thus in particular have
exponential decay of correlations. Moreover, in the context of one-dimensional smooth maps with
critical points it is known that in the unimodal case exponential growth of the derivative along the
critical orbit (the Collet-Eckmann condition) implies uniform hyperbolicity on periodic orbits [Now88]
which in turn implies that all Lyapunov exponents of all measures are positive [BruKel98] and the
converse is also true [NowSan98]. Thus Conjecture 1 is true in the one-dimensional unimodal setting.

We remark that the assumption of non-uniform expansion is crucial here. There are several exam- 
ples of systems which have exponential decay of correlations but clearly have invariant measures with 
zero Lyapunov exponents, e.g. partially hyperbolic maps or maps obtained as time-1 maps of certain 
flows [Dol98a, Dol98b, Dol00]. These examples however are not non-uniformly expanding, and are
generally partially hyperbolic which means that there are two continuous subbundles such that the 
derivative restricted to one subbundle has very good expanding properties or contracting properties 
and the other subbundle has the zero Lyapunov exponents. For reasons which are not at all clear, 
this might be better from the point of view of decay of correlations than a situation in which all the 
Lyapunov exponents of the absolutely continuous measure are positive but there is some embedded 
singular measure with zero Lyapunov exponent slowing down the mixing process. Certainly there is 
still a lot to be understood on this topic. 

8.2. Stability. The results on the existence of nonuniformly expanding maps for open sets or positive 
measure sets of parameters are partly stability results. They say that certain properties of a system, 
e.g. being nonuniformly expanding, are stable in a certain sense. We mention here two other forms of 
stability which can be investigated. 

8.2.1. Topological rigidity. The notion of (nonuniform) expansivity is, a priori, completely
metrical: it depends on the differentiable structure of $f$ and most constructions and estimates related to
nonuniform expansivity require delicate metric distortion bounds. However the statistical properties
we deduce (the existence of an acip, the rate of decay of correlations) are objects and quantities which
make sense in a much more general setting. A natural question therefore is whether the metric
properties are really necessary or just very useful conditions and to what extent the statistical properties
might depend only on the underlying topological structure of $f$. We recall that two maps $f : M \to M$
and $g : N \to N$ are topologically conjugate, $f \sim g$, if there exists a homeomorphism $h : M \to N$
such that $h \circ f = g \circ h$. We say that a property of $f$ is topological or depends only on the topological
structure of $f$ if it holds for all maps in the topological conjugacy class of $f$.

The existence of an absolutely continuous invariant measure is clearly not a topological invariant in
general: if $\mu_f$ is an acip for $f$ then we can define $\mu_g = h_* \mu_f$ by $\mu_g(A) = \mu_f(h^{-1}(A))$ which gives an
invariant probability measure which is not absolutely continuous unless the conjugating homeomorphism $h$
is itself absolutely continuous. For example the map of Theorem 6 has no acip even though it is
topologically conjugate to a uniformly expanding Markov map. However it turns out, quite remarkably,
that there are many situations in the setting of one-dimensional maps with critical points in which the
existence of an acip is indeed a topological property (although there are also examples in which it is
not [Bru98a]). Topological conditions which imply the existence of an acip for unimodal maps were
given in [Bru94, San95, Bru98b]. In [NowPrz98] (bringing together results of [NowSan98, PrzRoh98])
it was shown that the exponential growth condition along the critical orbit for unimodal maps (which
in particular implies the existence of an acip, see Theorem 8) is a topological property. A
counterexample to this result in the multimodal case was obtained in [PrzRivSmi03]. However it was shown in
[LuzWan03] that in the general multimodal case, if all critical points are generic with respect to the
acip, then the existence of an acip still holds for all maps in the same conjugacy class (although not
necessarily the genericity of the critical points).
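
As a concrete instance of the definition of topological conjugacy, one can check numerically the classical conjugacy between the tent map and the logistic map $y \mapsto 4y(1 - y)$ on $[0, 1]$, given by $h(x) = \sin^2(\pi x / 2)$. This example is not taken from the text above; it is included only as a sketch, and the function names are chosen here for illustration.

```python
import math


def tent(x):
    """Tent map on [0, 1]: piecewise linear with slopes +2 and -2."""
    return 2.0 * x if x <= 0.5 else 2.0 - 2.0 * x


def logistic(y):
    """Logistic map y -> 4 y (1 - y) on [0, 1]."""
    return 4.0 * y * (1.0 - y)


def h(x):
    """Conjugating homeomorphism h(x) = sin^2(pi x / 2) of [0, 1]."""
    return math.sin(0.5 * math.pi * x) ** 2


def conjugacy_defect(samples=1001):
    """Largest value of |h(tent(x)) - logistic(h(x))| over a uniform grid in [0, 1]."""
    grid = (i / (samples - 1) for i in range(samples))
    return max(abs(h(tent(x)) - logistic(h(x))) for x in grid)
```

The defect vanishes up to rounding error, reflecting the identity $h \circ T = g \circ h$. In this particular example $h$ is smooth, hence absolutely continuous, so the push-forward of Lebesgue measure (which is invariant for the tent map) is an acip for the logistic map; the point of the discussion above is that for a general conjugacy this need not be the case.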

We emphasize that all these results do not rely on showing that all conjugacies in question are 
absolutely continuous. Rather they depend on the existence of some topological property which forces 
the existence of an acip in each map in the conjugacy class. These acip's are generally not mapped to 
each other by the conjugacy. 

8.2.2. Stochastic stability. Stochastic stability is one way to formalize the idea that the statistical
properties of a dynamical system are stable under small random perturbations. There are several
positive results on stochastic stability for uniformly expanding [You86b, BalYou93, Cow00] and
nonuniformly expanding maps in dimension 1 [BalVia96, AraLuzVia] and higher [Ara00, AlvVia02]. See
[Alv03] for a comprehensive treatment of the results.

8.3. Nonuniform hyperbolicity and induced Markov maps. The definition of nonuniform
hyperbolicity in terms of conditions (*) and (**) given above is quite natural as these are assumptions
which do not a priori require the existence of an invariant measure. However they do imply the
existence of an acip $\mu$ which has all positive Lyapunov exponents. Thus the system $(f, \mu)$ is also
nonuniformly expanding in the more abstract sense of Pesin theory, see [BarPes04]. The systematic
construction of induced Markov maps in many examples and under quite general assumptions, as
described above, naturally leads to the question of whether such a construction is always possible in
this abstract setting. Since the existence of an induced Markov map implies nonuniform
expansivity this would essentially give an equivalent characterization of nonuniform expansivity. A general
result in this direction has been given for smooth one-dimensional maps in [San03]. It would be
interesting to extend this to arbitrary dimension. A generalization to nonuniformly hyperbolic surface
diffeomorphisms is work in progress [LuzSan].

It seems reasonable to believe that the scope of application of induced Markov towers may go 
well beyond the statistical properties of a map f. The construction of the induced Markov map in 
[San03], for example, is primarily motivated by the study of the Hausdorff dimension of certain sets. A 
particularly interesting application would be a generalization of the existence (parameter exclusion) 
argument sketched earlier. Even in a very general setting, with no information about the map 
f except perhaps the existence of an induced Markov map, it is natural to ask about the possible 
existence of induced Markov maps for small perturbations of f. If, moreover, the existence of an 
induced Markov map were essentially equivalent to nonuniform expansivity then this would be a 
question about the persistence of nonuniform expansivity under small perturbations. 

Conjecture 2. Suppose that f is nonuniformly expanding. Then sufficiently small perturbations of f 
have positive probability of also being nonuniformly expanding. 

Using the induced Markov maps one could define, even in a very abstract setting, a critical region 
A formed by those points that have very large return time. Then outside A one would have essentially 
uniform expansivity and this, as well as the Markov structure, would essentially persist under small 
perturbations. One could then perturb f and, up to parameter exclusions, try to show that the Markov 
structure can be extended once again to the whole of A for some nearby map g. 

8.4. Verifying nonuniform expansivity. The verification of the conditions of nonuniform 
hyperbolicity is a major problem on both a theoretical and a practical level. As mentioned earlier, for 
the important class of one-dimensional maps with critical points, nonuniform expansivity occurs with 
positive probability but for sets of parameters which are topologically negligible and thus essentially 
impossible to pinpoint exactly. The best we can hope for is to show that such parameters occur with 
"very high" probability in some given small range of parameter values. 

However even this is generally impossible with the available techniques. Indeed, all existing 
arguments rely on choosing a sufficiently small parameter interval centred on some sufficiently good 
parameter value. The closeness to this parameter value is then used to obtain the various conditions 
which are required to start the induction. However the problem then reduces to showing that such 
a good parameter value exists in the particular parameter interval of interest, and this is again both 
practically and theoretically impossible in general. Moreover, even if such a parameter value were 
determined (as in the special case of the "top" quadratic map), existing estimates control neither the 
size of the neighbourhood in which the good parameters are obtained nor the relative proportion of good 
parameters. For example there are no explicit bounds for the actual measure of the set of parameters 
in the quadratic family which have an acip. A standard coffee-break joke directed towards authors of 
the papers on the existence of such maps is that so much work has gone into proving the existence 
of a set of parameters which as far as we know might be infinitesimal. Moreover, there just does not 
seem to be any argument, even a heuristic one, for believing that such a set is or isn't very small. Thus, 
for no particular reason other than a reaction to these coffee-break jokers (!), we formulate the following 

Conjecture 3. The set of parameters in the quadratic family which admit an acip is "large". 

For definiteness let us say that "large" means at least 50% but it seems perfectly reasonable to 
expect even 80% or 90%, and of course we mean here those parameters between the Feigenbaum 
period doubling limit and the top map. An obvious strategy for proving (or disproving) this conjecture 
would be to develop a technique for estimating the proportion of maps having an acip in any given 
small one-parameter family of maps. The large parameter interval of the quadratic family could then 
be subdivided into small intervals and the contributions of these small intervals could then 
be added up. 
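
The bookkeeping of this subdivide-and-sum strategy can be illustrated by a purely numerical, 
non-rigorous sketch. Everything in it is an assumption made for illustration and not part of the 
text: the parametrization f_a(x) = a x (1 - x), the use of a positive finite-time Lyapunov exponent 
as a stand-in for the existence of an acip, and the hypothetical function names and default values. 
A positive finite-time estimate proves nothing about a given parameter, so the output is at best a 
heuristic indication. 

```python
import math

def lyapunov_estimate(a, n_iter=2000, n_transient=500, x0=0.3):
    """Finite-time Lyapunov exponent of f_a(x) = a*x*(1-x) along the orbit of x0.

    Heuristic only: the choice of x0, n_iter and n_transient is arbitrary.
    """
    x = x0
    for _ in range(n_transient):          # discard the transient part of the orbit
        x = a * x * (1.0 - x)
    total = 0.0
    for _ in range(n_iter):
        deriv = abs(a * (1.0 - 2.0 * x))  # |f_a'(x)|
        if deriv == 0.0:                  # orbit landed exactly on the critical point
            return float("-inf")
        total += math.log(deriv)
        x = a * x * (1.0 - x)
    return total / n_iter

def estimated_proportion(a_min, a_max, n_sub=10, samples_per_sub=5, threshold=0.0):
    """Subdivide [a_min, a_max] into n_sub pieces, estimate in each piece the
    proportion of sampled parameters with exponent above threshold, and add up
    the length-weighted contributions."""
    sub_len = (a_max - a_min) / n_sub
    weighted = 0.0
    for i in range(n_sub):
        left = a_min + i * sub_len
        hits = 0
        for j in range(samples_per_sub):
            # sample at midpoints of a regular grid inside the subinterval
            a = left + (j + 0.5) * sub_len / samples_per_sub
            if lyapunov_estimate(a) > threshold:
                hits += 1
        weighted += sub_len * hits / samples_per_sub
    return weighted / (a_max - a_min)
```

For instance, `estimated_proportion(3.57, 4.0)` samples the range between (approximately) the 
period doubling limit and the top map in this parametrization. A rigorous version would have to 
replace the sampling in each subinterval by a verified lower bound on the measure of good parameters. 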

A general technique of this kind would also be interesting in a much broader context of appli- 
cations. As mentioned in the introduction, many real-life systems appear to have a combination of 
deterministic and random-like behaviour which suggests that some form of expansivity and/or 
hyperbolicity might underlie the basic driving mechanisms. In modelling such a system it seems likely 
that one may obtain a parametrized family and be interested in a possibly narrow range of parameter 
values. It would be desirable therefore to be able to obtain a rigorous proof of the existence of 
stochastic-like behaviour such as mixing with exponential decay of correlations in this family and to be 
able to estimate the probability of such behaviour occurring. An extremely promising strategy has been 
proposed recently by K. Mischaikow. The idea is to combine non-trivial numerical estimates with 
the geometric and probabilistic parameter exclusion argument discussed above. Indeed the parameter 
exclusion argument relies fundamentally on an induction which shows that the probability of being 
excluded at time n is exponentially small in n. The implementation of this argument 
however also requires several delicate relations between different system constants to be satisfied and 
in particular no exclusions to be required before some sufficiently large time N, so that the 
exponentially small exclusions occurring for n > N cannot cumulatively add up to the full measure of the 
parameter interval under consideration. The assumption of the existence of a particularly good parameter 
value a* and the assumption that the parameter interval is a sufficiently small neighbourhood of a* 
are used in all existing proofs to make sure that certain constants can be chosen arbitrarily small or 
arbitrarily large thus guaranteeing that the necessary relations are satisfied. Mischaikow's suggestion 
is to reformulate the induction argument in such a way that the inductive assumptions can, at least in 
principle, be explicitly verified computationally. This requires the dependence of all the constants in 
the argument to be made completely explicit in such a way that the inductive assumptions boil down 
to a finite set of open conditions on the family of maps which can be verified with finite precision in 
finite time. 
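
To illustrate why the exclusions must be postponed until a sufficiently large time, suppose, purely 
as a hypothetical model, that no exclusions occur up to time N and that the proportion of parameters 
excluded at each time n > N is at most C e^{-\alpha n}. The symbols C, \alpha > 0 and N are 
introduced here for illustration and are not the constants of any specific proof. Summing the 
geometric series shows that the total excluded proportion is less than 1 exactly when N is large 
enough relative to C and \alpha: 

```latex
\sum_{n > N} C e^{-\alpha n}
  = \frac{C e^{-\alpha (N+1)}}{1 - e^{-\alpha}} < 1
\quad\Longleftrightarrow\quad
N + 1 > \frac{1}{\alpha} \log \frac{C}{1 - e^{-\alpha}} .
```

An inequality of this kind is an open condition on finitely many explicit constants, which is the 
sort of condition that could in principle be checked computationally. 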

Besides the interest of the argument in this particular setting this could perhaps develop into an 
extremely fruitful interaction between the "numerical" and the "geometric/probabilistic" approach 
to Dynamical Systems, and contribute significantly to the applicability of the powerful methods of 
Dynamical Systems to the solution and understanding of real-life phenomena. 